Empowering Organizations Through Data Literacy, Governance, and Business Literacy
In my journey as a data management professional, I've come to believe that the road to becoming a truly data-centric organization is paved with more than just tools and policies — it's about creating a culture where data literacy and business literacy thrive. Data governance, long regarded as a compliance-driven function, is now the backbone […]


Read More
Author: Gopi Maren

Identifying and Addressing Data Overload
Increased data generation requires modern businesses to manage vast volumes of information. All this data holds immense potential for insights and informed decision-making, but its value depends on effective utilization. Without the right tools, frameworks, and strategies, even established companies risk being overwhelmed by data overload. Let's take a closer look at data overload and […]


Read More
Author: Irfan Gowani

Data Professional Introspective: Your Organization Can't Create an EDM Strategy
Some countries successfully create long-term strategic plans. For example, China's first 100-year plan was aimed at the elimination of extreme poverty by 2020. In 1980, there were 540 million people living in extreme poverty; by 2014, there were only 80 million. The second 100-year plan, targeted for 2050, calls for achieving 30% of global GDP, to […]


Read More
Author: Melanie Mecca

The Data-Centric Revolution: Putting Knowledge Into Our Knowledge Graphs
I recently gave a presentation called "Knowledge Management and Knowledge Graphs" at a KMWorld conference, and a new picture of the relationship between knowledge management and knowledge graphs gradually came into focus. I recognized that the knowledge graph community has gotten quite good at organizing and harmonizing data and information, but there is little knowledge […]


Read More
Author: Dave McComb

The Challenges of Data Migration: Ensuring Smooth Transitions Between Systems
Data migration — the process of transferring data from one system to another — is a critical undertaking for organizations striving to upgrade infrastructure, consolidate systems, or adopt new technologies. However, data migration challenges can be very complex, especially in large-scale data migration projects. Duplicate or missing data, system compatibility issues, data security problems, […]


Read More
Author: Ainsley Lawrence

Legal Issues for Data Professionals: In AI, Data Itself Is the Supply Chain
Data is the supply chain for AI. For generative AI, even in fine-tuned, company-specific large language models, the data that goes into the training data comes from a host of different sources. If the data from any given source is unreliable, then the training data will be deficient and the LLM output will be untrustworthy. […]


Read More
Author: William A. Tanenbaum and Isaac Greaney

Becoming a Citizen Data Scientist Can Improve Career Opportunities
When a business decides to undertake a data democratization initiative, improve data literacy, and create a role for citizen data scientists, the management team often assumes that business users will be eager to participate, and that assumption can cause these initiatives to fail. Like every other cultural shift within an organization, the management team must […]


Read More
Author: Kartik Patel

User-Friendly External Smartblobs Using a Shadow Directory

I am very excited about the HCL Informix® 15 external smartblob feature.

If you are not familiar with them, external smartblobs allow the user to store actual Binary Large Object (blob) and Character Large Object (clob) data external to the database. Metadata about that external storage is maintained by the database.

Notes: This article does NOT discuss details of the smartblobs feature itself, but rather proposes a solution to make the functionality more user-friendly. For details on feature behavior, setup, and new functions, see the documentation.

At the time of writing this blog, v15.0 does not have the ifx_lo_path function defined, as required below. This has been reported to engineering. The workaround is to create it yourself with the following command:

create dba function ifx_lo_path(blob)
  returns lvarchar
  external name '(sq_lo_path)'
  language C;

This article also does not discuss details of client programming required to INSERT blobs and clobs into the database.

The external smartblob feature was built for two main reasons:

1. Backup size

Storing blobs in the database itself can cause the database to become extremely large. As such, performing backups on the database takes an inordinate amount of time, and level-0 backups can be impossible. Offloading the actual blob contents to an external file system can lessen the HCL Informix backup burden by putting the blob data somewhere else. The database still governs the storage of, and access to, the blob, but the physical blob is housed externally.

2. Easy access to blobs

Users would like easy access to blob data, with familiar tools, without having to go through the database.Ā 

Using External Smartblobs in HCL Informix 15

HCL Informix 15 introduces external smartblobs. When you define an external smartblob space, you specify the external directory location (outside the database) where you would like the actual blob data to be stored. Then you assign blob column(s) to that external smartblob space when you CREATE TABLE. When a row is INSERTed, HCL Informix stores the blob data in the defined directory using an internal identifier for the filename.

Here's an example of a customer forms table: custforms (denormalized and hardcoded for simplicity). My external sbspace directory is /home/informix/blog/resources/esbsp_dir1.

CREATE TABLE custforms(formid SERIAL, company CHAR(20), year INT, lname CHAR(20), 
formname CHAR(50), form CLOB) PUT form IN (esbsp);

Here, I INSERT a 2023 TaxForm123 document from a Java program for a woman named Sanchez, who works for Actian:

// assumes c is an open java.sql.Connection to the database
try (PreparedStatement p = c.prepareStatement(
         "INSERT INTO custforms (company, year, lname, formname, form) VALUES (?,?,?,?,?)");
     FileInputStream is = new FileInputStream("file.xml")) {
    p.setString(1, "Actian");
    p.setInt(2, 2023);           // year is an INT column
    p.setString(3, "Sanchez");
    p.setString(4, "TaxForm123");
    p.setBinaryStream(5, is);
    p.executeUpdate();
}

After I INSERT this row, my external directory and file would look like this:

[informix@schma01-rhvm03 resources]$ pwd
/home/informix/blog/resources
[informix@schma01-rhvm03 resources]$ ls -l esbsp*
-rw-rw---- 1 informix informix 10240000 Oct 17 13:22 esbsp_chunk1

esbsp_dir1:
total 0
drwxrwx--- 2 informix informix 41 Oct 17 13:19 IFMXSB0
[informix@schma01-rhvm03 resources]$ ls esbsp_dir1/IFMXSB0
LO[2,2,1(0x102),1729188125]

Here, LO[2,2,1(0x102),1729188125] is an actual file that contains the data that I could access directly. The problem is that if I want to directly access this file for Ms. Sanchez, I would first have to figure out that this file belongs to her and is the tax document I want. It's very cryptic!

A User-Friendly Smartblob Solution

Informix customers I talk to love the new external smartblobs feature, but wish it were a little more user-friendly.

As in the above example, instead of putting Sanchez's 2023 TaxForm123 into a general directory called IFMXSB0 in a file called LO[2,2,1(0x102),1729188125], which together are meaningless to an end-user, wouldn't it be nice if the file were located in an intuitive place like /home/forms/Actian/2023/TaxForm123/Sanchez.xml, or something similar... something meaningful... organized how YOU want it?

Having HCL Informix do this automatically is a little easier said than done, primarily because the database would not intuitively know how any one customer would want to organize their blobs. What exact directory substructure? From what column or columns do I form the file names? In what order? Every use case would be different.

Leveraging a User-Friendly Shadow Directory

The following solution shows how you can create your own user-friendly logical locations for your external smartblobs by automatically maintaining a lightweight shadow directory structure to correspond to actual storage locations. The solution uses a very simple system of triggers and stored procedures to do this.

Note: Examples here are shown on Linux, but other UNIX flavors should work also.

How to Set Up in 4 Steps

For each smartblob column in question:

STEP 1: Decide how you want to organize access to your files.

Decide what you want the base of your shadow directory to be and create it. In my case for this blog, it is: /home/informix/blog/resources/user-friendly. You could probably implement this solution without a set base directory (as seen in the examples), but that may not be a good idea because users would unknowingly start creating directories everywhere.
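For example, a one-time setup of the base directory used in this blog (run as the informix user, so that the SYSTEM commands in the procedures below can write to the tree):

# one-time creation of the shadow directory base
mkdir -p /home/informix/blog/resources/user-friendly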

STEP 2: Create a create_link stored procedure and corresponding trigger for INSERTs.

This procedure makes sure that the desired data-driven subdirectory structure exists from the base (mkdir -p), then forms a user-friendly logical link to the Informix smartblob file. From the trigger, you must pass in all the columns from which you want to form the directory structure and filename.

CREATE PROCEDURE

CREATE PROCEDURE create_link (p_formid INT, p_company CHAR(20), p_year INT,
p_lname CHAR(20), p_formname CHAR(50))
DEFINE v_oscommand CHAR(500);
DEFINE v_custlinkname CHAR(500);
DEFINE v_ifmxname CHAR(500);
DEFINE v_basedir CHAR(100);
-- set the base directory
LET v_basedir = '/home/informix/blog/resources/user-friendly';
-- make sure directory tree exists
LET v_oscommand = 'mkdir -p ' || TRIM(v_basedir) || '/' || TRIM(p_company) || '/' || 
TO_CHAR(p_year);
SYSTEM v_oscommand; 

-- form full link name 
LET v_custlinkname = TRIM(v_basedir) || '/' || TRIM(p_company) || '/' || TO_CHAR(p_year) 
|| '/' || TRIM(p_lname) || '.' || TRIM(p_formname) || '.' || TO_CHAR(p_formid);

-- get the actual location 
SELECT IFX_LO_PATH(form::LVARCHAR) INTO v_ifmxname FROM custforms WHERE formid = p_formid; 

-- create the os link 
LET v_oscommand = 'ln -s -f ' || '''' || TRIM(v_ifmxname) || '''' || ' ' || v_custlinkname; 
SYSTEM v_oscommand;

END PROCEDURE;

CREATE TRIGGER

CREATE TRIGGER ins_tr INSERT ON custforms REFERENCING new AS post
FOR EACH ROW(EXECUTE PROCEDURE create_link (post.formid, post.company,
post.year, post.lname, post.formname));

STEP 3: Create a delete_link stored procedure and corresponding trigger for DELETEs.

This procedure will delete the shadow directory link if the row is deleted.

CREATE PROCEDURE

CREATE PROCEDURE delete_link (p_formid INT, p_company CHAR(20), p_year INT,
p_lname CHAR(20), p_formname CHAR(50))
DEFINE v_oscommand CHAR(500);
DEFINE v_custlinkname CHAR(500); 
DEFINE v_basedir CHAR(100);
-- set the base directory
LET v_basedir = '/home/informix/blog/resources/user-friendly';
-- form full link name
LET v_custlinkname = TRIM(v_basedir) || '/' ||
TRIM(p_company) || '/' || TO_CHAR(p_year) || '/' || TRIM(p_lname) || '.'
|| TRIM(p_formname) || '.' || TO_CHAR(p_formid);
-- remove the link
LET v_oscommand = 'rm -f -d ' || v_custlinkname;
SYSTEM v_oscommand;

END PROCEDURE;

CREATE TRIGGER

CREATE TRIGGER del_tr DELETE ON custforms REFERENCING old AS pre FOR EACH ROW
(EXECUTE PROCEDURE delete_link (pre.formid, pre.company, pre.year, pre.lname, pre.formname));

STEP 4: Create a change_link stored procedure and corresponding trigger for UPDATEs, if desired. In my example, Ms. Sanchez might marry Mr. Simon, and an UPDATE to her last name in the database occurs. I may then want to change all my user-friendly names from Sanchez to Simon. This procedure deletes the old link and creates a new one.

Notice that the update trigger must fire only on the columns that form your directory structure and filenames.

CREATE PROCEDURE

CREATE PROCEDURE change_link (p_formid INT, p_pre_company CHAR(20), 
p_pre_year INT, p_pre_lname CHAR(20), p_pre_formname CHAR(50), p_post_company CHAR(20), 
p_post_year INT, p_post_lname CHAR(20), p_post_formname CHAR(50))

DEFINE v_oscommand CHAR(500);
DEFINE v_custlinkname CHAR(500);
DEFINE v_ifmxname CHAR(500);
DEFINE v_basedir CHAR(100);
-- set the base directory
LET v_basedir = '/home/informix/blog/resources/user-friendly';

-- get rid of old

-- form old full link name
LET v_custlinkname = TRIM(v_basedir) || '/' || TRIM(p_pre_company) || '/' || 
TO_CHAR(p_pre_year) || '/' || TRIM(p_pre_lname) || '.' || TRIM(p_pre_formname) || '.' 
|| TO_CHAR(p_formid) ;

-- remove the link and empty directories
LET v_oscommand = 'rm -f -d ' || v_custlinkname;
SYSTEM v_oscommand;

-- form the new
-- make sure directory tree exists
LET v_oscommand = 'mkdir -p ' || TRIM(v_basedir) || '/' || TRIM(p_post_company) || '/' || 
TO_CHAR(p_post_year);
SYSTEM v_oscommand;

-- form full link name
LET v_custlinkname = TRIM(v_basedir) || '/' || TRIM(p_post_company) || '/' || 
TO_CHAR(p_post_year) || '/' || TRIM(p_post_lname) || '.' || TRIM(p_post_formname) 
|| '.' || TO_CHAR(p_formid) ;

-- get the actual location
-- this is the same as before as id has not changed
SELECT IFX_LO_PATH(form::LVARCHAR) INTO v_ifmxname FROM custforms WHERE formid = p_formid;

-- create the os link
LET v_oscommand = 'ln -s -f ' || '''' || TRIM(v_ifmxname) || '''' || ' ' || v_custlinkname;
SYSTEM v_oscommand;

END PROCEDURE;

CREATE TRIGGER

CREATE TRIGGER upd_tr UPDATE OF formid, company, year, lname, formname ON custforms
REFERENCING OLD AS pre NEW AS post

FOR EACH ROW(EXECUTE PROCEDURE change_link (pre.formid, pre.company, pre.year, pre.lname, 
pre.formname, post.company, post.year, post.lname, post.formname));

Results Example

Back to our example.

With this infrastructure in place, in addition to the Informix-named file, I now have these user-friendly links on my file system that I can easily locate and identify.

INSERT

[informix@schma01-rhvm03 2023]$ pwd
/home/informix/blog/resources/user-friendly/Actian/2023
[informix@schma01-rhvm03 2023]$ ls
Sanchez.TaxForm123.2

If I do an ls -l, you will see that it is a link to the Informix blob file.

[informix@schma01-rhvm03 2023]$ ls -l
total 0
lrwxrwxrwx 1 informix informix 76 Oct 17 14:20 Sanchez.TaxForm123.2 -> 
/home/informix/blog/resources/esbsp_dir1/IFMXSB0/LO[2,2,1(0x102),1729188126]

UPDATE

If I then update her last name with UPDATE custforms SET lname = 'Simon' WHERE formid=2, my file system now looks like this:

[informix@schma01-rhvm03 2023]$ ls -l
lrwxrwxrwx 1 informix informix 76 Oct 17 14:25 Simon.TaxForm123.2 -> 
/home/informix/blog/resources/esbsp_dir1/IFMXSB0/LO[2,2,1(0x102),1729188126]

DELETE

If I then go and DELETE this form with DELETE FROM custforms WHERE formid=2, my directory structure looks like this:

[informix@schma01-rhvm03 2023]$ pwd
/home/informix/blog/resources/user-friendly/Actian/2023
[informix@schma01-rhvm03 2023]$ ls
[informix@schma01-rhvm03 2023]$

We Welcome Your Feedback

Please enjoy the new HCL Informix 15 external smartblob feature.

I hope this idea can make external smartblobs easier for you to use. If you have any feedback on the idea, especially on enhancements or experience in production, please feel free to contact me at mary.schulte@hcl-software.com. I look forward to hearing from you!

Find out more about the launch of HCL Informix 15.

Notes

1. Shadow directory permissions. In creating this example, I did not explore directory and file permissions, but rather just used general permissions settings on my sandbox server. Likely, you will want to control permissions to avoid some of the anomalies I discuss below.

2. Manual blob file delete. With external smartblobs, if permissions are not controlled, it is possible that a user might somehow delete the physical smartblob file itself from its directory. HCL Informix itself cannot prevent this from happening. In the event it does happen, HCL Informix does NOT delete the corresponding row; the blob file will just be missing. There may be aspects to links that can automatically handle this, but I have not investigated them for this blog.

3. Link deletion in the shadow directory. If permissions are not controlled, it is possible that a user might delete a logical link formed by this infrastructure. This solution does not detect that. If this is an issue, I would suggest a periodic maintenance job that cross-references the shadow directory links to blob files to detect missing links. For those blobs with missing links, write a database program to look up the row's location with the IFX_LO_PATH function, and reform the missing link (a sketch of such a job appears after these notes).

4. Unique identifiers. I highly recommend using unique identifiers in this solution. In this simple example, I used formid. You don't want to clutter things up, of course, but depending on how you structure your shadow directories and filenames, you may need to include more unique identifiers to avoid duplicate directory and link names.

5. Empty directories. I did not investigate whether there are options to rm in the delete stored procedure to clean up empty directories that might remain when a last item is deleted (one possible cleanup command appears after these notes).

6. Production overhead. It is known that excessive triggers and stored procedures can add overhead to a production environment. For this blog, it is assumed that OLTP activity on blobs is not excessive; therefore, production overhead should not be an issue. That said, this solution has NOT been tested at scale.

7. NULL values. Make sure to consider the presence and impact of NULL values in columns used in this solution. For simplicity, I did not handle them here.
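Two hedged sketches for the maintenance ideas above. First, for note 3, a minimal cross-referencing job, assuming the custforms schema and shadow base directory from this example (field trimming and error handling are simplified; a production job would validate each field):

-- dump each row's link components and actual blob location, pipe-delimited
UNLOAD TO 'blob_paths.unl'
SELECT formid, TRIM(company), year, TRIM(lname), TRIM(formname),
       IFX_LO_PATH(form::LVARCHAR)
FROM custforms;

Then, in the shell, re-create any shadow link that has gone missing:

#!/bin/sh
# re-create missing shadow links from the unloaded rows
base=/home/informix/blog/resources/user-friendly
while IFS='|' read -r formid company year lname formname path rest; do
  link="$base/$company/$year/$lname.$formname.$formid"
  mkdir -p "$base/$company/$year"
  [ -e "$link" ] || ln -s -f "$path" "$link"
done < blob_paths.unl

Second, for note 5, one possible cleanup for empty directories, assuming GNU find is available (run as a periodic job; -mindepth 1 keeps the base directory itself):

# prune directories left empty under the shadow tree
find /home/informix/blog/resources/user-friendly -mindepth 1 -type d -empty -delete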

Informix is a trademark of IBM Corporation in at least one jurisdiction and is used under license.


The post User-Friendly External Smartblobs Using a Shadow Directory appeared first on Actian.


Read More
Author: Mary Schulte

AI Predictions for 2025: Embracing the Future of Human and Machine Collaboration


Predictions are funny things. They often seem like a bold gamble, almost like trying to peer into the future with the confidence we inherently lack as humans. Technology's rapid advancement surprises even the most seasoned experts, especially when it progresses exponentially, as it often does. As physicist Albert A. Bartlett famously said, "The greatest shortcoming […]

The post AI Predictions for 2025: Embracing the Future of Human and Machine Collaboration appeared first on DATAVERSITY.


Read More
Author: Philip Miller

Data Monetization: The Holy Grail or the Road to Ruin?


Unlocking the value of data is a key focus for business leaders, especially the CIO. While in its simplest form, data can lead to better insights and decision-making, companies are pursuing an entirely different and more advanced agenda: the holy grail of data monetization. This concept involves aggregating a variety of both structured and unstructured […]

The post Data Monetization: The Holy Grail or the Road to Ruin? appeared first on DATAVERSITY.


Read More
Author: Tony Klimas

Beyond Ownership: Scaling AI with Optimized First-Party Data


Brands, publishers, MarTech vendors, and beyond recently gathered in NYC for Advertising Week and swapped ideas on the future of marketing and advertising. The overarching message from many brands was one we've heard before: First-party data is like gold, especially for personalization. But it takes more than "owning" the data to make it valuable. Scale and accuracy […]

The post Beyond Ownership: Scaling AI with Optimized First-Party Data appeared first on DATAVERSITY.


Read More
Author: Tara DeZao

Accelerating Innovation: Data Discovery in Manufacturing

The manufacturing industry is in the midst of a digital revolution. You've probably heard these buzzwords: Industry 4.0, IoT, AI, and machine learning – all terms that promise to revolutionize everything from assembly lines to customer service. Embracing this digital transformation is key to improving your competitive advantage, but new technology doesn't come without its own challenges. Each new piece of technology needs one thing to deliver innovation: data.

Data is the fuel powering your tech engines. Without the ability to understand where your data is, whether it's trustworthy, or who owns the datasets, even the most powerful tools can overcomplicate and confuse the best data teams. That's where modern data discovery solutions come in. They're like the backstage crew making sure everything runs smoothly – connecting systems, tidying up the data mess, and making sure everyone has exactly what they need, when they need it. That means faster insights, streamlined operations, and a lower total cost of ownership (TCO). In other words, data access is the key to staying ahead in today's fast-paced, highly competitive, increasingly sensitive manufacturing market.

The Problem

Data from all aspects of your business is siloed – whether it comes from sensors, legacy systems, cloud applications, suppliers, or customers – and trying to piece it all together is daunting, time-consuming, and just plain hard. Traditional methods are slow, cumbersome, and definitely not built for today's needs. This fragmented approach not only slows down decision-making but also keeps you from tapping into valuable insights that could drive innovation. And in a market where speed is everything, that's a recipe for falling behind.

So the big question is: how can you unlock the true potential of your data?

The Solution

So how do you turn data intelligence into a streamlined, efficient process? The answer lies in modern data discovery solutions – the unsung catalyst of a digital transformation motion. Rather than simply integrating data sources, data discovery solutions excel in metadata management, offering complete visibility into your company's data ecosystem. They enable users – regardless of skill level – to locate where data resides and assess the quality and relevance of the information. By providing this detailed understanding of data context and lineage, organizations can confidently leverage accurate, trustworthy datasets, paving the way for informed decision-making and innovation.

Key Components

Easy-to-Connect Data Sources for Metadata Management

One of the biggest hurdles in data integration is connecting to a variety of data sources, including legacy systems, cloud applications, and IoT devices. Modern data discovery tools like Zeenea offer easy connectivity, allowing you to extract metadata from various sources seamlessly. This unified view eliminates silos and enables faster, more informed decision-making across the organization.

Advanced Metadata Management

Metadata is the backbone of effective data discovery. Advanced metadata management capabilities ensure that data is well-organized, tagged, and easily searchable. This provides a clear context for data assets, helping you understand the origin, quality, and relevance of your data. This means better data search and discoverability.

Data Discovery Knowledge Graph

A data discovery knowledge graph serves as an intelligent map of your metadata, illustrating the intricate relationships and connections between data assets. It provides users with a comprehensive view of how data points are linked across systems, offering a clear picture of data lineage – from origin to current state. This visibility into the data journey is invaluable in manufacturing, where understanding the flow of information between production data, supply chain metrics, and customer feedback is critical. By tracing the lineage of data, you can quickly assess its accuracy, relevance, and context, leading to more precise insights and informed decision-making.

Quick Access to Quality Data Through Data Marketplace

A data marketplace provides a centralized hub where you can easily search, discover, and access high-quality data. This self-service model empowers your teams to find the information they need without relying on IT, accelerating time to insight. The result? Faster product development cycles, improved process efficiency, and enhanced decision-making capabilities.

User-Friendly Interface With Natural Language Search

Modern data discovery platforms prioritize user experience with intuitive, user-friendly interfaces. Features like natural language search allow users to query data using everyday language, making it easier for non-technical users to find what they need. This democratizes access to data across the organization, fostering a culture of data-driven decision-making.

Low Total Cost of Ownership (TCO)

Traditional metadata management solutions often come with a hefty price tag due to high infrastructure costs and ongoing maintenance. In contrast, modern data discovery tools are designed to minimize TCO with automated features, cloud-based deployment, and reduced need for manual intervention. This means more efficient operations and a greater return on investment.

Benefits

By leveraging a comprehensive data discovery solution, manufacturers can achieve several key benefits:

Enhanced Innovation

With quick access to quality data, teams can identify trends and insights that drive product development and process optimization.

Faster Time to Market

Automated implementation and seamless data connectivity reduce the time required to gather and analyze data, enabling faster decision-making.

Improved Operational Efficiency

Advanced metadata management and knowledge graphs help streamline data governance, ensuring that users have access to reliable, high-quality data.

Increased Competitiveness

A user-friendly data marketplace democratizes data access, empowering teams to make data-driven decisions and stay ahead of industry trends.

Cost Savings

With low TCO and reduced dependency on manual processes, manufacturers can maximize their resources and allocate budgets towards strategic initiatives.

Data is more than just a resource—it's a catalyst for innovation. By embracing advanced metadata management and data discovery solutions, you can find, trust, and access data. This not only accelerates time to market but also drives operational efficiency and boosts competitiveness. With powerful features like API-led automation, a data discovery knowledge graph, and an intuitive data marketplace, you'll be well-equipped to navigate the challenges of Industry 4.0 and beyond.

Call to Action

Ready to accelerate your innovation journey? Explore how Actian Zeenea can transform your manufacturing processes and give you a competitive edge.

Learn more about how our advanced data discovery solutions can help you unlock the full potential of your data. Sign up for a live product demo and Q&A.


The post Accelerating Innovation: Data Discovery in Manufacturing appeared first on Actian.


Read More
Author: Kasey Nolan

Mind the Gap: Architecting Santa's List – The Naughty-Nice Database


You never know what's going to happen when you click on a LinkedIn job posting button. I'm always on the lookout for interesting and impactful projects, and one in particular caught my attention: "Far North Enterprises, a global fabrication and distribution establishment, is looking to modernize a very old data environment." I clicked the button […]

The post Mind the Gap: Architecting Santa's List – The Naughty-Nice Database appeared first on DATAVERSITY.


Read More
Author: Mark Cooper

From Silos to Synergy: Data Discovery for Manufacturing

Introduction

There is an urgent reality that many manufacturing leaders are facing, and that's data silos. Valuable information remains locked within departmental systems, hindering your ability to make strategic, well-informed decisions. A data catalog and enterprise data marketplace solution provides a comprehensive, integrated view of your organization's data, breaking down silos and enabling true collaboration.

The Problem: Data Silos Impede Visibility

In your organization, each department maintains its own critical datasets – finance compiles detailed financial reports, sales leverages CRM data, marketing analyzes campaign performance, and operations tracks supply chain metrics. But here's the challenge: how confident are you that you even know what data is available, who owns it, or whether it's high quality?

The issue goes beyond traditional data silos. It's not just that the data is isolated – it's that your teams are unaware of what data even exists. This lack of visibility creates a blind spot. Without a clear understanding of your company's data landscape, you face inefficiencies, inconsistent analysis, and missed opportunities. Departments end up duplicating work, using outdated or unreliable data, and making decisions based on incomplete information.

The absence of a unified approach to data discovery and cataloging means that even if the data is technically accessible, it remains hidden in plain sight, trapped in disparate systems without any context or clarity. Without a comprehensive search engine for your data, your organization will struggle to:

  • Identify data sources: You can't leverage data if you don't know it exists. Without visibility into all available datasets, valuable information often remains unused, limiting your ability to make fully informed decisions.
  • Assess data quality: Even when you find the data, how do you know it's accurate and up-to-date? A lack of metadata means you can't evaluate the quality or relevance of the information, leading to analysis based on faulty data.
  • Understand data ownership: When it's unclear who owns or manages specific datasets, you waste time tracking down information and validating its source. This confusion slows down projects and introduces unnecessary friction.

The Solution

Now, imagine the transformative potential if your team could search for and discover all available data across your organization as easily as using a search engine. Implementing a robust metadata management strategy—including data lineage, discovery, and cataloging—bridges the gaps between disparate datasets, enabling you to understand what data exists, its quality, and how it can be used. Instead of chasing down reports or sifting through isolated systems, your teams gain an integrated view of your company's data assets.

  • Data Lineage provides a clear map of how data flows through your systems, from its origin to its current state. It allows you to trace the journey of your data, ensuring you know where it came from, how it's been transformed, and if it can be trusted. This transparency is crucial for verifying data quality and making accurate, data-driven decisions.
  • Data Discovery enables teams to quickly search through your company's data landscape, finding relevant datasets without needing to know the specific source system. It's like having a powerful search tool that surfaces all available data, complete with context about its quality and ownership, helping your team unlock valuable insights faster.
  • A Comprehensive Data Catalog serves as a central hub for all your metadata, documenting information about the datasets, their context, quality, and relationships. It acts as a single source of truth, making it easy for any team member to understand what data is available, who owns it, and how it can be used effectively.

Revolutionizing Your Operations With Metadata Management

This approach can transform the way each department operates, fostering a culture of informed decision-making and reducing inefficiencies:

  • Finance gains immediate visibility into relevant sales data, customer demand forecasts, and historical trends, allowing for more accurate budgeting and financial planning. With data lineage, your finance team can verify the source and integrity of financial metrics, ensuring compliance and minimizing risks.
  • Sales can easily search for and access up-to-date product data, customer insights, and market analysis, all without needing to navigate complex systems. A comprehensive data catalog simplifies the process of finding the most relevant datasets, enabling your sales team to tailor their pitches and close deals faster.
  • Marketing benefits from an integrated view of customer behavior, campaign performance, and product success. Using data discovery, your marketing team can identify the most impactful campaigns and refine strategies based on real-time feedback, driving greater engagement and ROI.
  • Supply Chain Leaders can trace inventory data back to its origin, gaining full visibility into shipments, supplier performance, and potential disruptions. With data lineage, they understand the data's history and quality, allowing for proactive adjustments and optimized procurement.
  • Manufacturing Managers have access to a clear, unified view of production data, demand forecasts, and operational metrics. The data catalog offers a streamlined way to integrate insights from across the company, enabling better decision-making in scheduling, resource allocation, and quality management.
  • Operations gains a comprehensive understanding of the entire production workflow, from raw materials to delivery. Data discovery and lineage provide the necessary context for making quick adjustments, ensuring seamless production and minimizing delays.

This strategy isn't about collecting more data—it's about creating a clearer, more reliable picture of your entire business. By investing in a data catalog, you turn fragmented insights into a cohesive, navigable map that guides your strategic decisions with clarity and confidence. It's the difference between flying blind and having a comprehensive navigation system that leads you directly to success.

The Benefits: From Fragmentation to Unified Insight

When you prioritize data intelligence with a catalog as a cornerstone, your organization gains access to a powerful suite of benefits:

  1. Enhanced Decision-Making: With a unified view of all data sources, your team can make well-informed decisions based on real-time insights. Data lineage allows you to trace back the origin of key metrics, ensuring the accuracy and reliability of your analysis.
  2. Improved Collaboration Across Teams: With centralized metadata and clear data relationships, every department has access to the same information, reducing silos and fostering a culture of collaboration.
  3. Greater Efficiency and Reduced Redundancies: By eliminating duplicate efforts and streamlining data access, your teams can focus on strategic initiatives rather than time-consuming data searches.
  4. Proactive Risk Management: Full visibility into data flow and origins enables you to identify potential issues before they escalate, minimizing disruptions and maintaining smooth operations.
  5. Increased Compliance and Data Governance: Data lineage provides a transparent trail for auditing purposes, ensuring your organization meets regulatory requirements and maintains data integrity.

Conclusion

Data silos are more than just an operational inconvenience—they are a barrier to your company's growth and innovation. By embracing data cataloging, lineage, and governance, you empower your teams to collaborate seamlessly, leverage accurate insights, and make strategic decisions with confidence. It is time to break down the barriers, integrate your metadata, and unlock the full potential of your organization's data.

Call to Action

Are you ready to eliminate data silos and gain a unified view of your operations? Discover the power of metadata management with our comprehensive platform. Visit our website today to learn more and sign up for a live product demo and Q&A.

The post From Silos to Synergy: Data Discovery for Manufacturing appeared first on Actian.


Read More
Author: Kasey Nolan

5 Data Management Tool and Technology Trends to Watch in 2025


The market surrounding data management tools and technologies is quite mature. After all, the typical business has been making extensive use of data to help streamline its operations and decision-making for years, and many companies have long had data management tools in place. But that doesn't mean that little is happening in the world of […]

The post 5 Data Management Tool and Technology Trends to Watch in 2025 appeared first on DATAVERSITY.


Read More
Author: Matheus Dellagnelo

How to Foster a Cross-Organizational Approach to Data Initiatives


In today's business landscape, data reigns supreme. It is the cornerstone of effective decision-making, fuels innovation, and drives organizational success. However, despite its immense potential, many organizations struggle to harness the full power of their data due to a fundamental disconnect between IT and business teams. This division not only impedes progress but also undermines […]

The post How to Foster a Cross-Organizational Approach to Data Initiatives appeared first on DATAVERSITY.


Read More
Author: Abhas Ricky

Data Insights Ensure Quality Data and Confident Decisions
Every business (large or small) creates and depends upon data. One hundred years ago, businesses looked to leaders and experts to strategize and to create operational goals. Decisions were based on opinion, guesswork, and a complicated mixture of notes and records reflecting historical results that may or may not be relevant to the future. Today, […]


Read More
Author: Kartik Patel

Securing Your Data With Actian Vector

The need to secure data from unauthorized access is not new. It has been required by laws for handling personally identifiable information (PII) for quite a while. But the increasing use of data services in the cloud for all kinds of proprietary data that is not PII now makes data security an important part of most data strategies.

This is the start of a series of blog posts that take a detailed look at how data security can be ensured with Actian Vector. The first post explains the basic concept of encryption at rest and how Actian Vector's Database Encryption functionality implements it.

Understanding Encryption at Rest

Encryption at rest refers to the encryption of data at rest, which means data that is persisted, usually on disk or in cloud storage. In a database system, this is mainly user data in tables and indexes, but it also includes the metadata describing the organization of the user data. The main purpose of encryption at rest is to secure the persisted data from unauthorized direct access on disk or in cloud storage, that is, without a connection to the database system.

The encryption can be transparent to the database applications. In this case, encryption and decryption are managed by the administrator, usually at the level of databases. The application then does not need to be aware of the encryption: it connects to the database to access and work with the data as if there were no encryption at all. In Actian Vector, this type of encryption at rest is called database encryption.

Encryption at the application level, on the other hand, requires the application to handle the encryption and decryption. Often this means that the user of the application has to provide an encryption key for both the encryption (e.g., when data is inserted) and the decryption (e.g., when data is selected). While more complicated, it provides more control to the application and the user.

For example, encryption can be applied in a more fine-grained way to specific tables, columns in tables, or even individual record values in table columns. It may be possible to use individual encryption keys for different data values. Thus, users can encrypt their private data with their own encryption key and be sure that without this encryption key, no other user can see the data in clear text. In Actian Vector, encryption at the application level is referred to as function-based encryption.

Using Database Encryption in Actian Vector

In Actian Vector, the encryption that is transparent to the application works at the scope of a database and is therefore called database encryption. Whether a database is encrypted or not is determined when the database is created and cannot be changed later. When a database is created with database encryption, all the persisted data in tables and indexes, as well as the metadata for the database, is encrypted.

The encryption method is 256-bit AES, which requires a 32-byte symmetric encryption key. Symmetric means that the same key is used to encrypt and decrypt the data. This key is individually generated for each encrypted database and is called a database (encryption) key.

To have the database key available, it is stored in an internal system file of the database server, where it is protected by a passphrase. This passphrase is provided by the user when creating the database. However, the database key is not used to directly encrypt the user data. Instead, it is used to encrypt, i.e., protect, yet another set of encryption keys that in turn are used to encrypt the user data in the tables and indexes. This set of encryption keys is called table (encryption) keys.
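In short, the keys form a hierarchy, summarizing the paragraphs above:

passphrase (supplied by the user at database creation)
  protects -> database key (32-byte key for 256-bit AES, stored in an internal system file)
    protects -> table keys
      encrypt -> user data in tables and indexes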

Once the database is created, the administrator can use the chosen passphrase to "lock" the database. When the database is locked, the encrypted data cannot be accessed. Likewise, the administrator also uses the passphrase to "unlock" a locked database and thus re-enable access to the encrypted data. When the database is unlocked, the administrator can change the passphrase. If desired, it is also possible to rotate the database key when changing the passphrase.

The rotation of the database key is optional, because it means that the whole container of the table keys needs to be decrypted with the old database key and then re-encrypted with the new database key. Because this container of the table keys also contains other metadata, it can be quite large, and thus the rotation of the database key can become a slow and computationally expensive operation. Database key rotation is therefore recommended only if there is a reasonable suspicion that the database key was compromised. Most of the time, changing only the passphrase should be sufficient, and it is done quickly.

With Actian Vector it is also possible to rotate the table encryption keys. This is done independently of changing the passphrase and the database key, and can be performed on a complete database as well as on individual tables. For each key that is rotated, the data must be decrypted with the old key and re-encrypted with the new key. In this case, we are dealing with the user data in tables and indexes. If this data is very large, the key rotation can be very costly and time-consuming. This is especially true when rotating all table keys of a database.

A typical workflow for using database encryption in Actian Vector:

  • Create a database with encryption:
      1. createdb -encrypt <database_name>

This command prompts the user twice for the passphrase and then creates the database with encryption. The new database remains unlocked, i.e. it is readily accessible, until it is explicitly locked or until shutdown of the database system.

It is important that the creator of the database remembers the provided passphrase because it is needed to unlock the database and make it accessible, e.g. after a restart of the database system.

  • Lock the encrypted database:
      1. Connect to the unlocked database with the Terminal Monitor:
        sql <database_name>
      2. SQL to lock the database:
        DISABLE PASSPHRASE '<user supplied passphrase>'; \g

The SQL statement locks the database. New connection attempts to the database are rejected with a corresponding error. Sessions that connected previously can still access the data until they disconnect.

To make the database lock also immediately effective for already connected sessions, additionally issue the following SQL statement:

      1. CALL X100(TERMINATE); \g
  • Unlock the encrypted database:
      1. Connect to the locked database with the Terminal Monitor and the "-no_x100" option:
        sql -no_x100 <database_name>
      2. SQL to unlock the database:
        ENABLE PASSPHRASE '<user supplied passphrase>'; \g

The connection with the "-no_x100" option connects without access to the warehouse data, but allows the administrative SQL statement to unlock the database.

  • Change the passphrase for the encrypted database:
      1. Connect to the unlocked database with the Terminal Monitor:
        sql <database_name>
      2. SQL to change the passphrase:
        ALTER PASSPHRASE '<old user supplied passphrase>' TO
        '<new passphrase>'; \g

Again, it is important that the administrator remembers the new passphrase.

After changing the passphrase for an encrypted database, it is recommended to perform a new database backup (a.k.a. "database checkpoint") to ensure continued full database recoverability.
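For example, a new checkpoint can be taken with the ckpdb utility that Vector inherits from Ingres (a minimal sketch; your site's backup procedure may differ):

# take a new checkpoint (full backup) of the encrypted database
ckpdb <database_name>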

  • When the database is no longer needed, destroy it:
      1. destroydb <database_name>

Note that the passphrase of the encrypted database is not needed to destroy it. The command can only be performed by users with the proper privileges, i.e. the database owner and administrators.

This first blog post in the database security series explained the concept of encryption at rest and how transparent encryption — called Database Encryption in Actian Vector — is used.

The next blog post in this series will take a look at function-based encryption in Actian Vector.

The post Securing Your Data With Actian Vector appeared first on Actian.


Read More
Author: Martin Fuerderer

Synthetic Data Generation: Addressing Data Scarcity and Bias in ML Models


There is no doubt that machine learning (ML) is transforming industries across the board, but its effectiveness depends on the data it's trained on. ML models traditionally rely on real-world datasets to power the recommendation algorithms, image analysis, chatbots, and other innovative applications that make the technology so transformative. However, using actual data creates two significant challenges […]

The post Synthetic Data Generation: Addressing Data Scarcity and Bias in ML Models appeared first on DATAVERSITY.


Read More
Author: Anshu Raj

5 Reasons to Invest in a Next-Gen Data Catalog

Organizations across every vertical face numerous challenges managing their data effectively and with full transparency. That's at least partially due to data often being siloed across multiple systems or departments, making it difficult for employees to find, trust, and unlock the value of their company's data assets.

Enter the Actian Zeenea Data Discovery Platform. This data intelligence solution is designed to address data issues by empowering everyone in an organization to easily find and trust the data they need to drive better decision-making, streamline operations, and ensure compliance with regulatory standards.

The Zeenea platform serves as a centralized data catalog and an enterprise data marketplace. By improving data visibility, access, and governance, it provides a scalable and efficient framework for businesses to leverage their data assets. The powerful platform helps organizations explore new and sustainable use cases, including these five:

1. Overcome Data Silo and Complexity Challenges

Data professionals are well familiar with the struggles of working in environments where data is fragmented across departments and systems. This leads to data silos that restrict access to critical information, which ends up creating barriers to fully optimizing data.

Another downside to having barriers to data accessibility is that users spend significant time locating data instead of analyzing it, resulting in inefficiencies across business processes. The Zeenea platform addresses accessibility issues by providing a centralized, searchable repository of all data assets.

The repository is enriched with metadata—such as data definitions, ownership, and quality metrics—that gives context and meaning to the organization's data. Both technical and non-technical users can quickly find and understand the data they need, either by searching for specific terms, filtering by criteria, or through personalized recommendations. This allows anyone who needs data to quickly and easily find what they need without requiring IT skills or relying on another team for assistance.

For example, marketing analysts looking for customer segmentation data for a new campaign can quickly locate relevant datasets in the Zeenea platform. Whether analysts know exactly what they're searching for or are browsing through the data catalog, the platform provides insights into each dataset's source, quality, and usage history.

Based on this information, analysts can decide whether to request access to the actual data or consult the data owner to fix any quality issues. This speeds up the data usage process and ensures that decision-makers have access to the best available data relevant for the campaign.

2. Solve the Issue of Limited Data Access for Business Users

In many organizations, data access is often limited to technical teams such as IT or data engineering. Being dependent on specialty or advanced skills creates bottlenecks because business users must request data from other teams. This reliance on IT or engineering departments leads to delayed insights and increases the workload on technical teams that may already be stretched thin.

The Zeenea platform helps democratize data access by enabling non-technical users to explore and "shop" for data in a self-service environment. With Zeenea's Enterprise Data Marketplace, business users can easily discover, request, and use data that has been curated and approved by data governance teams. This self-service model reduces the reliance on IT and data specialists, empowering all employees across the organization to make faster, data-driven decisions.

Barrier-free data access can help all users and departments. For instance, sales managers preparing for a strategy meeting can use the Enterprise Data Marketplace to access customer reports and visualizations—without needing to involve the data engineering team.

By using the Zeenea platform, sales managers can pull data from various departments, such as finance, sales, or marketing, to create a comprehensive view of customer behavior. This allows the managers to identify opportunities for improved engagement as well as cross-sell and upsell opportunities.

3. Gain Visibility Into Data Origins and Compliance Requirements

As organizations strive to meet stringent regulatory requirements that seem to be constantly changing, having visibility into both data origins and data transformations becomes essential. Understanding how data has been sourced, modified, and managed is crucial for compliance and auditing processes. However, without proper tracking systems, tracing this information accurately can be extremely difficult.

This is another area where the Zeenea platform can help. It provides detailed data lineage tracking, allowing users to trace the entire lifecycle of a dataset. From the data's origin to its transformation and usage, the platform offers a visual map of data flows, making it easier to troubleshoot errors, detect anomalies, and verify the accuracy of reports.

With this capability, organizations can present clear audit trails to demonstrate compliance with regulatory standards. A common use case is in the financial sector. A bank facing a regulatory audit can leverage Zeenea's data lineage feature to show auditors exactly how financial data has been handled.

By comprehensively tracing each dataset, the bank can easily demonstrate compliance with industry regulations. Plus, having visibility into data reduces the complexity of the audit process and builds trust in data management practices.

4. Provide Ongoing Data Governance

Managing data governance in compliance with internal policies and external regulations is another top priority for organizations. With laws such as GDPR and HIPAA that have strict penalties, companies must ensure that sensitive data is handled securely and data usage is properly tracked.

The Zeenea platform delivers capabilities to meet this challenge head-on. It enables organizations to define and enforce governance rules across their data assets, ensuring that sensitive information is securely managed. Audit trail, access control, and data lineage features help organizations comply with regulatory requirements. These features also play a key role in ensuring data is properly cataloged and monitored.

Organizations in industries like healthcare that handle highly sensitive information can benefit from the Zeenea platform. The platform can help companies, like those in healthcare, manage access controls, encryption, and data monitoring. This ensures compliance with HIPAA and other regulations while safeguarding patient privacy. Additionally, the platform streamlines internal governance practices, ensuring that all data users follow established guidelines for data security.

5. Build a Data-Driven Organization

The Actian Zeenea Data Discovery Platform offers a comprehensive solution to modern data management challenges. By improving data discovery, governance, and access, the Zeenea platform removes barriers to data usage, making it easier for organizations to unlock the full value of their data assets.

Whether it's giving business users self-service capabilities, streamlining compliance efforts, or supporting a data mesh approach that decentralizes data management, the platform gives individual departments the ability to manage their own data while maintaining organization-wide visibility. Additionally, the platform provides the tools and infrastructure needed to thrive in today's data-driven world.

Experience a Live Demo

Organizations looking to improve their data outcomes should consider the Zeenea platform. By creating a single source of truth for data across the enterprise, the solution enables faster insights, smarter decisions, and stronger compliance—all key drivers of business success in the digital age. Find out more by joining a live product demo.

The post 5 Reasons to Invest in a Next-Gen Data Catalog appeared first on Actian.


Read More
Author: Dee Radh

Book of the Month: "AI Governance Comprehensive"


Welcome to December 2024's "Book of the Month" column. This month, we're featuring "AI Governance Comprehensive: Tools, Vendors, Controls, and Regulations" by Sunil Soares, available for free download on the YourDataConnect (YDC) website. This book offers readers a strong foundation in AI governance. While the emergence of generative AI (GenAI) has brought AI governance to […]

The post Book of the Month: "AI Governance Comprehensive" appeared first on DATAVERSITY.


Read More
Author: Mark Horseman

Technical and Strategic Best Practices for Building Robust Data Platforms


In the AI era, organizations are eager to harness innovation and create value through high-quality, relevant data. Gartner, however, projects that 80% of data governance initiatives will fail by 2027. This statistic underscores the urgent need for robust data platforms and governance frameworks. A successful data strategy outlines best practices and establishes a clear vision for data architecture, […]

The post Technical and Strategic Best Practices for Building Robust Data Platforms appeared first on DATAVERSITY.


Read More
Author: Alok Abhishek

The Rise of Cloud Repatriation in Storage Solutions


In recent years, a large number of businesses have jumped on the wave and moved their applications and data to the cloud, with nine out of 10 IT professionals considering the cloud a cornerstone of their digital strategy. While the cloud can offer flexibility, the reality is that managing cloud expenses and the resulting security liabilities have become significantly […]

The post The Rise of Cloud Repatriation in Storage Solutions appeared first on DATAVERSITY.


Read More
Author: Roger Brulotte
