Now Is the Time for Executives to Deploy Ethical Rules Around AI


For better or worse, AI is causing disruption in almost every field imaginable. Corporations around the world are embracing its possibilities to make work more efficient. The success of ChatGPT and other generative AI tools has also caught the attention of nearly every industry in an effort to meet profitability, efficiency, and sustainability goals. Money […]

The post Now Is the Time for Executives to Deploy Ethical Rules Around AI appeared first on DATAVERSITY.


Read More
Author: Usman Shuja

SQL at 50: A Lesson in How to Stay Relevant Around Data


Structured query language (SQL) is now 50 years old. The original paper for SQL (then called SEQUEL) was published in May 1974 by Raymond Boyce and Donald Chamberlin, and provided a guide for data manipulation based on a set of simple commands. Today, we take SQL for granted around data – it is still the third most […]

The post SQL at 50: A Lesson in How to Stay Relevant Around Data appeared first on DATAVERSITY.


Read More
Author: Dave Stokes

Six Data Quality Dimensions to Get Your Data AI-Ready
If you look at Google Trends, you’ll see that the explosion of searches for generative AI (GenAI) and large language models correlates with the introduction of ChatGPT back in November 2022. GenAI has brought hope and promise for those who have the creativity and innovation to dream big, and many have formulated impressive and pioneering […]


Read More
Author: Allison Connelly

Good Data Quality Is the Secret to Successful GenAI Implementation


You wouldn’t build a house without a concrete foundation. So why are many technology leaders attempting to adopt GenAI technologies before ensuring their data quality can be trusted? Reliable and consistent data is the bedrock of a successful AI strategy. Incomplete or inconsistent data prompts GenAI models to propose equally unreliable outputs, calling the basic […]

The post Good Data Quality Is the Secret to Successful GenAI Implementation appeared first on DATAVERSITY.


Read More
Author: Stephany Lapierre

Data-Driven Defense: AI as the New Frontier in Business Security


Major business setbacks due to risk management failures happen every year. They are also some of the costliest, adding up to millions of dollars in regulatory fines, lawsuits, payouts, and lost brand value. Leaders want to avoid these types of issues and rely on sound internal data management to mitigate risk and maintain confidence and […]

The post Data-Driven Defense: AI as the New Frontier in Business Security appeared first on DATAVERSITY.


Read More
Author: Prasad Sabbineni

Crossing the Data Divide: AI Data Assistants — A Data Leader’s Force Multiplier
The focus of my last column, titled Crossing the Data Divide: Data Catalogs and the Generative AI Wave, was on the impact of large language models (LLM) and generative artificial intelligence (AI) and how we disseminate knowledge throughout the enterprise and the future role of the data catalogs. Spoiler alert if you have not read […]


Read More
Author: John Wills

Maximizing Business Value with Generative AI


Have we ever seen anything adopted as quickly as generative AI (GenAI)? Think about it: ChatGPT launched in 2022 and gained 100 million users in two months. In comparison, we have been hearing about AI for a few years, but the adoption rates of AI have varied from 25% to […]

The post Maximizing Business Value with Generative AI appeared first on DATAVERSITY.


Read More
Author: Chetan Alsisaria

Why It’s Time to Rethink Generative AI in the Enterprise


If you’ve been keeping an eye on the evolution of generative AI (GenAI) technology recently, you’re likely familiar with its core concepts: how GenAI models function, the art of crafting prompts, and the types of data GenAI models rely on. While these fundamental components within GenAI remain constant, the way they’re applied is transforming. The […]

The post Why It’s Time to Rethink Generative AI in the Enterprise appeared first on DATAVERSITY.


Read More
Author: Eamonn O’Neill

Why the Rise of LLMs and GenAI Requires a New Approach to Data Storage


The new wave of data-hungry machine learning (ML) and generative AI (GenAI)-driven operations and security solutions has increased the urgency for companies to adopt new approaches to data storage. These solutions need access to vast amounts of data for model training and observability. However, to be successful, ML pipelines must use data platforms that offer […]

The post Why the Rise of LLMs and GenAI Requires a New Approach to Data Storage appeared first on DATAVERSITY.


Read More
Author: Marty Kagan

Getting Ahead of Shadow Generative AI


As with any new technology, a lot of people are keen to use generative AI to help them in their jobs. Accenture research found that 89% of businesses think that using generative AI to make services feel more human will open up more opportunities for them. This will force change – Accenture also found that 86% […]

The post Getting Ahead of Shadow Generative AI appeared first on DATAVERSITY.


Read More
Author: Dom Couldwell

How to Effectively Prepare Your Data for Gen AI

Many organizations are prioritizing the deployment of generative AI for a number of mission-critical use cases. This isn’t surprising. Everyone seems to be talking about Gen AI, with some companies now moving forward with various applications.

While company leaders may be ready to unleash the power of Gen AI, their data may not be as ready. That’s because a lack of proper data preparation is setting up many organizations for costly and time-consuming setbacks.

However, when approached correctly, proper data prep can help accelerate and enhance Gen AI deployments. That’s why preparing data for Gen AI is essential, just as it is for other analytics, to avoid “garbage in, garbage out” and to prevent skewed results.

As Actian shared in our presentation at the recent Gartner Data & Analytics Summit, there are both promises and pitfalls when it comes to Gen AI. That’s why you need to be skeptical about the hype and make sure your data is ready to deliver the Gen AI results you’re expecting.

Data Prep is Step One

We noted in our recent news release that comprehensive data preparation is the key to ensuring generative AI applications can do their job effectively and deliver trustworthy results. This is supported by the Gartner “Hype Cycle for Artificial Intelligence, 2023” that says, “Quality data is crucial for generative AI to perform well on specific tasks.”

In addition, Gartner explains that “Many enterprises attempt to tackle AI without considering AI-specific data management issues. The importance of data management in AI is often underestimated, so data management solutions are now being adjusted for AI needs.”

A lack of adequately prepared data is certainly not a new issue. For example, 70% of digital transformation projects fail because of hidden challenges that organizations haven’t thought through, according to McKinsey. This is proving true for Gen AI too: there is a range of challenges many organizations are not thinking about in their rush to deploy a Gen AI solution. One challenge is data quality, which must be addressed before making data available for Gen AI use cases.

What a New Survey Reveals About Gen AI Readiness

To gain insights into companies’ readiness for Gen AI, Actian commissioned research that surveyed 550 organizations in seven countries—70% of respondents were director level or higher. The survey found that Gen AI is being increasingly used for mission-critical use cases:

  • 44% of survey respondents are implementing Gen AI applications today.
  • 24% are just starting and will be implementing it soon.
  • 30% are in the planning or consideration stage.

The majority of respondents trust Gen AI outcomes:

  • 75% say they have a good deal or high degree of trust in the outcomes.
  • 5% say they have little or not very much trust in them.

It’s important to note that 75% of those who trust Gen AI outcomes developed that trust based on their use of other Gen AI solutions such as ChatGPT rather than their own deployments. This level of undeserved trust has the potential to lead to problems because users do not fully understand the risk that poor data quality poses to Gen AI outcomes in business.

It’s one issue if ChatGPT makes a typo. It’s quite another issue if business users are turning to Gen AI to write code, audit financial reports, create designs for physical products, or deliver after-visit summaries for patients. These high-value use cases do not have a margin for error. It’s not surprising, therefore, that our survey found that 87% of respondents agree that data prep is very or extremely important to Gen AI outcomes.

Use Our Checklist to Ensure Data Readiness

While organizations may have a high degree of confidence in Gen AI, the reality is that their data may not be as ready as they think. As Deloitte notes in “The State of Generative AI in the Enterprise,” organizations may become less confident over time as they gain experience with the larger challenges of deploying generative AI at scale. “In other words, the more they know, the more they might realize how much they don’t know,” according to Deloitte.

This could be why only four percent of people in charge of data readiness say they are ready for Gen AI, according to Gartner’s “We Shape AI, AI Shapes Us: 2023 IT Symposium/Xpo Keynote Insights.” At Actian, we realize there’s a lot of competitive pressure to implement Gen AI now, which can prompt organizations to launch it without thinking through data and approaches carefully.

In our experience at Actian, there are many hidden risks related to navigating and achieving desired outcomes for Gen AI. Addressing these risks requires you to:

  • Ensure data quality and cleanliness
  • Monitor the accuracy of training data and machine learning optimization
  • Identify shifting data sets along with changing use case and business requirements over time
  • Map and integrate data from outside sources, and bring in unstructured data
  • Maintain compliance with privacy laws and security issues
  • Address the human learning curve

Actian can help your organization get your data ready to optimize Gen AI outcomes. We have a “Gen AI Data Readiness Checklist” that includes the results of our survey and also a strategic checklist to get your data prepped. You can also contact us, and our experts will help you find the fastest path to the Gen AI deployment that’s right for your business.

The post How to Effectively Prepare Your Data for Gen AI appeared first on Actian.


Read More
Author: Actian Corporation

Generative AI Challenges and Opportunities for Modern Enterprises


Generative AI (GenAI), machine learning (ML), and large language models (LLMs) are all becoming increasingly important to modern enterprises, but achieving measurable value from AI is still a challenge. Part of the issue is that a well-trained AI model relies on a large amount of data, and for many companies, organizing and making use of […]

The post Generative AI Challenges and Opportunities for Modern Enterprises appeared first on DATAVERSITY.


Read More
Author: Coral Trivedi

The Rise of Generative AI in Insurance


The global market for artificial intelligence (AI) in insurance is predicted to reach nearly $80 billion by 2032, according to Precedence Research. This growth is being driven by the increased adoption of AI within insurance companies, enhancing their operational efficiency, risk management, and customer engagement. Despite widespread integration of AI in the industry today, its full […]

The post The Rise of Generative AI in Insurance appeared first on DATAVERSITY.


Read More
Author: Stan Smith

The Future-Proof Data Preparation Checklist for Generative AI Adoption

Data preparation is a critical step in the data analysis workflow and is essential for ensuring the accuracy, reliability, and usability of data for downstream tasks. But as companies continue to struggle with data access and accuracy, and as data volumes multiply, the challenges of data silos and trust become more pronounced.

According to Ventana Research, data teams spend a whopping 69% of their time on data preparation tasks. Data preparation might be the least enjoyable part of their job, but the quality and cleanliness of data directly impact analytics, insights, and decision-making. This also holds true for generative AI. The quality of your training data impacts the performance of gen AI models for your business.

High-Quality Input Data Leads to Better-Trained Models and Higher-Quality Generated Outputs

Generative AI models, such as Generative Adversarial Networks (GANs) or Variational Autoencoders (VAEs), learn from patterns and structures present in the input data to generate new content. To train models effectively, data must be curated, transformed, and organized into a structured format, free from missing values, missing fields, duplicates, inconsistent formatting, outliers, and biases.
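To make those data problems concrete, here is a minimal sketch of a quality audit that counts missing values, exact duplicate rows, and numeric outliers before any training starts. It is an illustration only: the function name `audit_records`, the list-of-dictionaries layout, and the 1.5 × IQR outlier rule are assumptions chosen for the example, not part of any Actian product.

```python
from collections import Counter
from statistics import quantiles


def audit_records(records, numeric_field):
    """Report missing values, exact duplicates, and IQR outliers.

    `records` is assumed to be a list of flat dicts with hashable values;
    `numeric_field` is the (hypothetical) column checked for outliers.
    """
    # A value counts as missing when it is None or an empty string.
    missing = sum(1 for r in records for v in r.values() if v in (None, ""))

    # Exact duplicates: rows whose field/value pairs are all identical.
    counts = Counter(tuple(sorted(r.items())) for r in records)
    duplicates = sum(n - 1 for n in counts.values())

    # Outliers by the common 1.5 * IQR rule (needs at least two numeric values).
    values = [r[numeric_field] for r in records
              if isinstance(r.get(numeric_field), (int, float))]
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low or v > high]

    return {"missing": missing, "duplicates": duplicates, "outliers": outliers}


sample = [
    {"id": 1, "amount": 10.0, "country": "US"},
    {"id": 2, "amount": 12.0, "country": "US"},
    {"id": 3, "amount": 11.0, "country": ""},
    {"id": 4, "amount": 13.0, "country": "DE"},
    {"id": 4, "amount": 13.0, "country": "DE"},
    {"id": 5, "amount": None, "country": "FR"},
    {"id": 6, "amount": 500.0, "country": "FR"},
]
report = audit_records(sample, "amount")
print(report)
```

An audit like this only surfaces problems. Deciding whether a flagged value is an error or a legitimate extreme still takes domain knowledge.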

Without a doubt, data preparation is time-consuming and repetitive. But failure to adequately prepare data can result in suboptimal performance, biased outcomes, and ethical, legal, and practical challenges for generative AI applications.

Generative AI models lacking sufficient data preparation may face several challenges and limitations. Here are three major consequences:

Poor Quality Outputs

Generative AI models often require data to be represented in a specific format or encoding in a way that’s suitable for the modeling task. Without proper data preparation, the input data may contain noise, errors, or biases that negatively impact the training process. As a result, generative AI models may produce outputs that are of poor quality, lack realism, or contain artifacts and distortions.

Biased Outputs

Imbalanced datasets, in which certain classes or categories are underrepresented, can lead to biased models and poor generalization performance. Data preparation helps remove the noise, errors, and biases in training data that can adversely affect the model’s ability to learn and generate realistic outputs.
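As a rough illustration of checking for imbalance, the sketch below measures each class's share of a dataset and then applies random oversampling, which is only one of several possible remedies. The helper names `class_shares` and `oversample` and the toy loan-decision labels are hypothetical.

```python
import random
from collections import Counter


def class_shares(labels):
    """Return each label's fraction of the dataset."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def oversample(rows, label_of, seed=0):
    """Randomly duplicate minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Draw with replacement to fill the gap up to the largest class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


rows = [("application", "approved")] * 8 + [("application", "denied")] * 2
shares = class_shares([label for _, label in rows])
balanced = oversample(rows, label_of=lambda row: row[1])
balanced_counts = Counter(label for _, label in balanced)
print(shares, balanced_counts)
```

Duplicating minority rows can cause a model to overfit them, so collecting more representative data or reweighting classes may be a better fit depending on the use case.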

Compromised Ethics and Privacy

Generative AI models trained on sensitive or personal data must adhere to strict privacy and ethical guidelines. Data preparation involves anonymizing or de-identifying sensitive information to protect individuals’ privacy and comply with regulatory requirements, such as GDPR or HIPAA.
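One common building block for de-identification is replacing direct identifiers with keyed hashes. The sketch below shows the idea with Python's standard `hmac` module; the field names, the `pseudonymize` helper, and the hard-coded key are assumptions for illustration. Note that this is pseudonymization rather than full anonymization, and whether it satisfies GDPR or HIPAA in a given setting is a legal question this example does not answer.

```python
import hashlib
import hmac


def pseudonymize(record, sensitive_fields, secret_key):
    """Return a copy of `record` with sensitive fields replaced by keyed hashes.

    The same input and key always yield the same token, so records can still
    be joined on the token without exposing the original value.
    """
    out = dict(record)  # leave the caller's record untouched
    for field in sensitive_fields:
        value = out.get(field)
        if value is None:
            continue
        digest = hmac.new(secret_key, str(value).encode("utf-8"),
                          hashlib.sha256).hexdigest()
        out[field] = digest[:16]  # shortened token, for readability only
    return out


# Hypothetical key: in practice it would come from a secrets manager.
key = b"example-key-do-not-use-in-production"
patient = {"name": "Jane Doe", "email": "jane@example.com", "visit_reason": "checkup"}
masked = pseudonymize(patient, ["name", "email"], key)
print(masked)
```

Free-text fields such as visit notes can still contain identifying details, so field-level hashing alone is rarely sufficient.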

By following a systematic checklist for data preparation, data scientists can improve model performance, reduce bias, and accelerate the development of generative AI applications. Here are six steps to follow:

  1. Project Goals

  • Clearly outline the objectives and desired outcomes of the generative AI model so you can identify the types of data needed to train the model
  • Understand how the model will be utilized in the business context

  2. Data Collection

  • Determine and gather all potential sources of data relevant to the project
  • Consider structured and unstructured data from internal and external sources
  • Ensure data collection methods comply with relevant regulations and privacy policies (e.g., GDPR)

  3. Data Prep

  • Handle missing values, outliers, and inconsistencies in the data
  • Standardize data formats and units for consistency
  • Perform exploratory data analysis (EDA) to understand the characteristics, distributions, and patterns in the data

  4. Model Selection and Training

  • Choose an appropriate generative AI model architecture based on project requirements and data characteristics (e.g., GANs, VAEs, autoregressive models). Consider pre-trained models or architectures tailored to specific tasks
  • Train the selected model using the prepared dataset
  • Validate model outputs qualitatively and quantitatively. Conduct sensitivity analysis to understand model robustness

  5. Deployment Considerations

  • Prepare the model for deployment in the business environment
  • Optimize model inference speed and resource requirements
  • Implement monitoring mechanisms to track model performance in production

  6. Documentation and Reporting

  • Document all steps taken during data preparation, model development, and evaluation
  • Address concerns related to fairness, transparency, and privacy throughout the project lifecycle
  • Communicate findings and recommendations to stakeholders effectively for full transparency into processes
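The Data Prep step in the checklist above (handling missing values and standardizing formats) can be sketched in a few lines. This is a simplified example under stated assumptions: the column names `signup`, `country`, and `amount`, the three accepted date formats, and median imputation are all choices made for illustration, and at least one row is assumed to have a known amount.

```python
from datetime import datetime
from statistics import median

# Date layouts assumed to occur in the raw data (hypothetical).
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def to_iso_date(text):
    """Parse a date in any accepted layout and return it as YYYY-MM-DD, or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean(rows):
    """Standardize formats, fill missing amounts with the median, drop duplicates."""
    fill = median(r["amount"] for r in rows if r["amount"] is not None)
    cleaned, seen = [], set()
    for r in rows:
        row = {
            "signup": to_iso_date(r["signup"]),
            "country": r["country"].strip().upper(),
            "amount": r["amount"] if r["amount"] is not None else fill,
        }
        key = tuple(row.items())
        if key in seen:  # duplicates often only appear after standardizing
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


raw = [
    {"signup": "2024-03-05", "country": " us", "amount": 10.0},
    {"signup": "03/05/2024", "country": "US", "amount": 10.0},
    {"signup": "05.03.2024", "country": "de", "amount": None},
]
cleaned = clean(raw)
print(cleaned)
```

Deduplicating after standardization matters here: the first two raw rows look different but describe the same record once dates and country codes share one format.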

Data preparation is a critical step for generative AI because it ensures that the input data is of high quality, appropriately represented, and well-suited for training models to generate realistic, meaningful, and ethically responsible outputs. By investing time and effort in data preparation, organizations can improve the performance, reliability, and ethical soundness of their generative AI applications.

Actian Data Preparation for Gen AI

The Actian Data Platform comes with unified data integration, warehousing and visualization in a single platform. It includes a comprehensive set of capabilities for preprocessing, transformations, enrichment, normalization and serialization of structured, semi-structured and unstructured data such as JSON/XML, delimited files, RDBMS, JDBC/ODBC, HBase, Binary, ORC, ARFF, Parquet and Avro.

At Actian, our mission is to enable data engineers, data scientists and data analysts to work with high-quality, reliable data, no matter where it lives. We believe that when data teams focus on delivering comprehensive and trusted data pipelines, business leaders can truly benefit from groundbreaking technologies, such as gen AI.

The best way for artificial intelligence and machine learning (AI/ML) data teams to get started is with a free trial of the Actian Data Platform. From there, you can load your own data and explore what’s possible within the platform. Alternatively, book a demo to see how Actian can help automate data preparation tasks in a robust, scalable, price-performant way.

Meet our Team at the Gartner Data & Analytics Summit 2024 

Join us for Gartner Data & Analytics Summit 2024, March 11 – 13, in Orlando, FL, where you’ll receive a step-by-step guide on readying your data for Gen AI adoption. Check out our session, “Don’t Fall for the Hype: Prep Your Data for Gen AI,” on Tuesday, March 12 at 1:10pm at the Dolphin Hotel, Atlantic Hall, Theater 3.

The post The Future-Proof Data Preparation Checklist for Generative AI Adoption appeared first on Actian.


Read More
Author: Dee Radh

Data Preparation Guide: 6 Steps to Deliver High Quality Gen AI Models


The post Data Preparation Guide: 6 Steps to Deliver High Quality Gen AI Models appeared first on Actian.


Read More
Author: Dee Radh

Ask a Data Ethicist: What Data Is OK to Use to Prompt ChatGPT?


Millions of people are using ChatGPT on a regular basis to assist them in both personal and professional capacities. This month’s question centers on the data used to prompt ChatGPT.  Our reader, who shared that they are an ESL speaker, would like to know if it’s ethical to have ChatGPT create summaries of information (specifically, […]

The post Ask a Data Ethicist: What Data Is OK to Use to Prompt ChatGPT? appeared first on DATAVERSITY.


Read More
Author: Katrina Ingram

How GenAI Bridges the Data Gap Between CMOs and CFOs


Marketing budgets are never entirely safe. While it may seem like pressure is easing as global economic estimates turn slightly sunnier, consumer demand is still getting more expensive to capture and close – which means scrutiny from finance chiefs is as tough as ever. To keep investment flowing, CMOs need to get better at not only boosting […]

The post How GenAI Bridges the Data Gap Between CMOs and CFOs appeared first on DATAVERSITY.


Read More
Author: Harriet Durnford-Smith

Data Speaks for Itself: Is AI the Cure for Data Curation?
By now, it is clear to everyone that AI, especially generative AI, is the only topic you’re allowed to write about. It seems to have impacted every area of information technology, so, I will try my best to do my part. However, when it comes to data curation and data quality management, there seems to […]


Read More
Author: Dr. John Talburt

Organizations Are Underutilizing Their Data – Here’s Why (and How to Fix It)


The mainstreaming of predictive analytics and generative AI has brought Data Management into focus. Artificial Intelligence both runs on and produces a vast amount of data that must be effectively managed, governed, and analyzed. However, a recent survey of 1,000 North American executives revealed that organizations aren’t quite up for the challenge. 

Many companies are not prepared to implement AI or other technologies into their existing IT infrastructure and workforce in a timely manner…

The post Organizations Are Underutilizing Their Data – Here’s Why (and How to Fix It) appeared first on DATAVERSITY.


Read More
Author: Tanvir Khan

Why Organizations Are Transitioning from OpenAI to Fine-Tuned Open-Source Models


In the rapidly evolving generative AI landscape, OpenAI has revolutionized the way developers build prototypes, create demos, and achieve remarkable results with large language models (LLMs). However, when it’s time to put LLMs into production, organizations are increasingly moving away from commercial LLMs like OpenAI’s in favor of fine-tuned open-source models. What’s driving this shift, and why are developers embracing it? The primary motivations are simple…

The post Why Organizations Are Transitioning from OpenAI to Fine-Tuned Open-Source Models appeared first on DATAVERSITY.


Read More
Author: Devvret Rishi

AI’s Massive Growth Puts Retail Data in the Spotlight


2023 was an incredible year in the development of artificial intelligence (AI). With the massive adoption of technologies like ChatGPT, millions of people are now uncovering new ways to use AI to create content, including email, video animation, and even code.

Since its first debut to the public in 2022, generative AI has dominated headlines and conversations about its potential impact on nearly every aspect of business and life…

The post AI’s Massive Growth Puts Retail Data in the Spotlight appeared first on DATAVERSITY.


Read More
Author: Nicola Kinsella

The AI Playbook: Providing Important Reminders to Data Professionals
Eric Siegel’s “The AI Playbook” serves as a crucial guide, offering important insights for data professionals and their internal customers on effectively leveraging AI within business operations. The book comes out on February 6th, and its insights are captured in six statements: Determine the value; Establish a prediction goal; Establish evaluation metrics; Prepare […]


Read More
Author: Myles Suer

The Promise and Potential of Human-Like Generative AI Chatbots


They’ve passed SATs, graduate records exams, and medical licensing exams, and programmers have used them to solve obscure coding challenges in seconds. Undoubtedly, generative AI chatbots’ capabilities are astounding, but this doesn’t mean they get it right every time.  Despite their success in providing contextually relevant and, for the most part, accurate answers, can generative […]

The post The Promise and Potential of Human-Like Generative AI Chatbots appeared first on DATAVERSITY.


Read More
Author: Nate MacLeitch

The Rise of RAG-Based LLMs in 2024


As we step into 2024, one trend stands out prominently on the horizon: the rise of retrieval-augmented generation (RAG) models in the realm of large language models (LLMs). In the wake of challenges posed by hallucinations and training limitations, RAG-based LLMs are emerging as a promising solution that could reshape how enterprises handle data. The surge […]

The post The Rise of RAG-Based LLMs in 2024 appeared first on DATAVERSITY.


Read More
Author: Kyle Kirwan