Legislative Event Explores How Agencies Using AI Can Protect Data

Once data is collected from consumers, who owns it? According to speakers from IBM and Atos, discussing the topic at a recent event organized through the Texas Legislature, collected data and insights always belong to the creator.

Local and state governments looking to implement artificial intelligence in their systems, whether customer-facing or back-office, have a responsibility to protect the data of their constituents. In a panel organized by the Texas Legislature’s Innovation and Technology Caucus (IT Caucus) last week, Bala Vaithyalingam, a global leader for data and AI at IBM, and Bogdan Udrea, head of digital workplace adoption at Atos, advised representatives on the responsible use and collection of data, who owns the collected data, and what data AI systems should be granted access to.

Vaithyalingam began by stressing the importance of knowing where the data comes from when using generative AI.

“Every time you want to play in any system, you want to segregate PII [personally identifiable information] or domain-specific data sets from a foundation,” said Vaithyalingam. “That is a very fundamental principle … so that you have much more control on what domain-specific data you want to bring in based on the use case.”

Udrea expanded with his own principles related to responsible data usage and fair use.

“From my perspective, we follow four key principles,” said Udrea. “Use only the data that you have been explicitly allowed to use. Use only the data that you need for the use case, not more than that. Be very transparent about what you're actually going to be using the data [for], and probably the hardest one of all, do no harm.”

Once data is collected from consumers, who owns it? According to Vaithyalingam and Udrea, collected data and insights always belong to the creator.

In his explanation, Vaithyalingam split data into two categories: hard and soft. According to Vaithyalingam, hard data is typically provided by consumers either directly or through IoT (Internet of Things) devices such as mobile phones. Soft data consists of subjective insights inferred from signals such as geolocation and purchase patterns.

“There are a significant amount of insights that can be generated, such as whether I’m a coffee drinker, a lot of information I may or may not know about myself,” said Vaithyalingam. “So when you talk about data ownership, just because they use AI and they’ve got the insights, they cannot be the owner of the data. When there is hard data and soft data, ownership always belongs to the original creator or provider — in this case, the consumer.”

Regarding data governance and security, Vaithyalingam explained that understanding what an AI model is built on will help ensure that data used to train said system is kept secure from breaches and misuse.

“How do we find a curated foundation model? I think that is one of the fundamental questions that must be asked if you’re working on any generative AI project,” said Vaithyalingam. “Make sure [you know] what the model is built on. Is that a curated foundation model, [where you] remove all PII, copyrights and use of profanity? So again, going back to the data governance principle, there are certain security measures that need to be there, and once sensitive data is identified, then you need to apply the industry standard security principles to safeguard.”

Udrea agreed and emphasized the importance of data accuracy in the face of potential data poisoning, in which a bad actor deliberately injects unreliable data into an AI model’s training data to manipulate its predictive behavior.

“It goes back to you as an organization,” said Udrea. “You have to be transparent about the source of the data, where it came from, because in that moment it’s very clear whether you're using accurate data … I think there’s going to be a massive growth in the industry [of companies] that are going to be able to not only validate if your data set is accurate or not, but validate whether the output that you have created is intended to do harm.”

On the subject of AI regulation, Vaithyalingam argued that privacy warrants its own separate legislation. During another panel at the event, on AI’s role in local and state government, Texas CIO and Executive Director of the Department of Information Resources (DIR) Amanda Crawford and other industry experts echoed many of the same sentiments.

“Technology is not really the issue,” said Crawford. “It’s the policy, it’s the ethics, it’s the privacy, it’s the bias, it’s the protection of constitutional rights, it’s all of those things that are the hard part. With all candor, I don’t think that government gets it right and has a fully developed discipline around it.”

For state agencies that want to incorporate AI into their processes and systems but don’t know where to start, another panel of experts at the event shared recommendations and insights into AI’s role in the workforce.

Chandler Treon is an Austin-based staff writer. He has a bachelor’s degree in English, a master’s degree in literature and a master’s degree in technical communication, all from Texas State University.