I am very excited about the HCL Informix® 15 external smartblob feature.
If you are not familiar with them, external smartblobs allow the user to store actual Binary Large Object (blob) and Character Large Object (clob) data external to the database. Metadata about that external storage is maintained by the database.
Note: This article does NOT discuss details of the smartblobs feature itself, but rather proposes a solution to make the functionality more user-friendly. For details on feature behavior, setup, and new functions, see the documentation.
As of this writing, v15.0 does not have the ifx_lo_path function defined, as required below. This has been reported to engineering. The workaround is to create it yourself with the following command:
create dba function ifx_lo_path(blob)
    returns lvarchar
    external name '(sq_lo_path)'
    language C;
This article also does not discuss details of client programming required to INSERT blobs and clobs into the database.
The external smartblob feature was built for two main reasons:
1. Backup size
Storing blobs in the database itself can cause the database to become extremely large. As a result, backups of the database take an inordinate amount of time, and level-0 backups can become impossible. Offloading the actual blob contents to an external file system lessens the HCL Informix backup burden by putting the blob data somewhere else. The database still governs the storage of, and access to, each blob, but the physical blob is housed externally.
2. Easy access to blobs
Users would like easy access to blob data, with familiar tools, without having to go through the database.
Using External Smartblobs in HCL Informix 15
HCL Informix 15 introduces external smartblobs. When you define an external smartblob space, you specify the external directory location (outside the database) where you would like the actual blob data to be stored. Then you assign blob column(s) to that external smartblob space when you CREATE TABLE. When a row is INSERTed, HCL Informix stores the blob data in the defined directory using an internal identifier for the filename.
Here's an example of a customer forms table: custforms (denormalized and hardcoded for simplicity). My external sbspace directory is /home/informix/blog/resources/esbsp_dir1.
CREATE TABLE custforms(
    formid   SERIAL,
    company  CHAR(20),
    year     INT,
    lname    CHAR(20),
    formname CHAR(50),
    form     CLOB
) PUT form IN (esbsp);
Here, I INSERT a 2023 TaxForm123 document from a Java program for a woman named Sanchez, who works for Actian:
try (PreparedStatement p = c.prepareStatement(
         "INSERT INTO custforms (company, year, lname, formname, form) VALUES (?,?,?,?,?)");
     FileInputStream is = new FileInputStream("file.xml")) {
    p.setString(1, "Actian");
    p.setInt(2, 2023);          // year is an INT column
    p.setString(3, "Sanchez");
    p.setString(4, "TaxForm123");
    p.setBinaryStream(5, is);
    p.executeUpdate();
}
After I INSERT this row, my external directory and file would look like this:
[informix@schma01-rhvm03 resources]$ pwd
/home/informix/blog/resources
[informix@schma01-rhvm03 resources]$ ls -l esbsp*
-rw-rw---- 1 informix informix 10240000 Oct 17 13:22 esbsp_chunk1

esbsp_dir1:
total 0
drwxrwx--- 2 informix informix 41 Oct 17 13:19 IFMXSB0
[informix@schma01-rhvm03 resources]$ ls esbsp_dir1/IFMXSB0
LO[2,2,1(0x102),1729188125]
Here LO[2,2,1(0x102),1729188125] is an actual file containing the data, which I could access directly. The problem is that if I want to access this file directly for Ms. Sanchez, I would first have to figure out that it belongs to her and is the tax document I want. It's very cryptic!
A User-Friendly Smartblob Solution
Informix customers I've talked to love the new external smartblobs feature but wish it were a little more user-friendly.
As in the above example, instead of putting Sanchez's 2023 TaxForm123 into a general directory called IFMXSB0, in a file called LO[2,2,1(0x102),1729188125], which together are meaningless to an end user, wouldn't it be nice if the file were located in an intuitive place like /home/forms/Actian/2024/TaxForm123/Sanchez.xml, or something similarly meaningful, organized how YOU want it?
Having HCL Informix do this automatically is a little easier said than done, primarily because the database cannot intuitively know how any one customer would want to organize their blobs. What exact directory substructure? From which column or columns should the file names be formed? In what order? Every use case is different.
Leveraging a User-Friendly Shadow Directory
The following solution shows how you can create your own user-friendly logical locations for your external smartblobs by automatically maintaining a lightweight shadow directory structure to correspond to actual storage locations. The solution uses a very simple system of triggers and stored procedures to do this.
Note: The examples here are shown on Linux, but other UNIX flavors should work as well.
How to Set Up in 4 Steps
Complete the following four steps for each smartblob column in question.
STEP 1: Decide how you want to organize access to your files.
Decide what you want the base of your shadow directory to be, and create it. For this blog, mine is /home/informix/blog/resources/user-friendly. You could probably implement this solution without a fixed base directory (as seen in the examples), but that may not be a good idea, because users would unknowingly start creating directories everywhere.
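As a one-time setup on Linux, that might look like the following minimal sketch (the path is this blog's example base directory; the permission mode is an illustrative assumption, see the notes on permissions at the end of this article):

# Create the shadow directory base once, before creating the triggers.
mkdir -p /home/informix/blog/resources/user-friendly
# Illustrative permissions only; choose settings appropriate for your site.
chmod 770 /home/informix/blog/resources/user-friendly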
STEP 2: Create a create_link stored procedure and corresponding trigger for INSERTs.
This procedure makes sure that the desired data-driven subdirectory structure exists under the base (mkdir -p), then forms a user-friendly logical link to the Informix smartblob file. From the trigger, you must pass this procedure all the columns from which you want to form the directory structure and filename.
CREATE PROCEDURE
CREATE PROCEDURE create_link (p_formid INT, p_company CHAR(20), p_year INT,
                              p_lname CHAR(20), p_formname CHAR(50))

DEFINE v_oscommand    CHAR(500);
DEFINE v_custlinkname CHAR(500);
DEFINE v_ifmxname     CHAR(500);
DEFINE v_basedir      CHAR(100);

-- set the base directory
LET v_basedir = '/home/informix/blog/resources/user-friendly';

-- make sure the directory tree exists
LET v_oscommand = 'mkdir -p ' || TRIM(v_basedir) || '/' || TRIM(p_company) ||
                  '/' || TO_CHAR(p_year);
SYSTEM v_oscommand;

-- form the full link name
LET v_custlinkname = TRIM(v_basedir) || '/' || TRIM(p_company) || '/' ||
                     TO_CHAR(p_year) || '/' || TRIM(p_lname) || '.' ||
                     TRIM(p_formname) || '.' || TO_CHAR(p_formid);

-- get the actual storage location
SELECT IFX_LO_PATH(form::LVARCHAR) INTO v_ifmxname
  FROM custforms WHERE formid = p_formid;

-- create the OS link
LET v_oscommand = 'ln -s -f ' || '''' || TRIM(v_ifmxname) || '''' || ' ' ||
                  v_custlinkname;
SYSTEM v_oscommand;

END PROCEDURE
CREATE TRIGGER
CREATE TRIGGER ins_tr INSERT ON custforms
    REFERENCING new AS post
    FOR EACH ROW (EXECUTE PROCEDURE create_link
        (post.formid, post.company, post.year, post.lname, post.formname));
STEP 3: Create a delete_link stored procedure and corresponding trigger for DELETEs.
This procedure will delete the shadow directory link if the row is deleted.
CREATE PROCEDURE
CREATE PROCEDURE delete_link (p_formid INT, p_company CHAR(20), p_year INT,
                              p_lname CHAR(20), p_formname CHAR(50))

DEFINE v_oscommand    CHAR(500);
DEFINE v_custlinkname CHAR(500);
DEFINE v_basedir      CHAR(100);

-- set the base directory
LET v_basedir = '/home/informix/blog/resources/user-friendly';

-- form the full link name
LET v_custlinkname = TRIM(v_basedir) || '/' || TRIM(p_company) || '/' ||
                     TO_CHAR(p_year) || '/' || TRIM(p_lname) || '.' ||
                     TRIM(p_formname) || '.' || TO_CHAR(p_formid);

-- remove the link
LET v_oscommand = 'rm -f -d ' || v_custlinkname;
SYSTEM v_oscommand;

END PROCEDURE
CREATE TRIGGER
CREATE TRIGGER del_tr DELETE ON custforms
    REFERENCING old AS pre
    FOR EACH ROW (EXECUTE PROCEDURE delete_link
        (pre.formid, pre.company, pre.year, pre.lname, pre.formname));
STEP 4: Create a change_link stored procedure and corresponding trigger for UPDATEs, if desired. In my example, Ms. Sanchez might marry a Mr. Simon, and an UPDATE to her last name in the database occurs. I may then want to change all my user-friendly names from Sanchez to Simon. This procedure deletes the old link and creates a new one.
Notice that the update trigger must fire only on the columns that form your directory structure and filenames.
CREATE PROCEDURE
CREATE PROCEDURE change_link (p_formid INT,
                              p_pre_company CHAR(20), p_pre_year INT,
                              p_pre_lname CHAR(20), p_pre_formname CHAR(50),
                              p_post_company CHAR(20), p_post_year INT,
                              p_post_lname CHAR(20), p_post_formname CHAR(50))

DEFINE v_oscommand    CHAR(500);
DEFINE v_custlinkname CHAR(500);
DEFINE v_ifmxname     CHAR(500);
DEFINE v_basedir      CHAR(100);

-- set the base directory
LET v_basedir = '/home/informix/blog/resources/user-friendly';

-- get rid of the old link
-- form the old full link name
LET v_custlinkname = TRIM(v_basedir) || '/' || TRIM(p_pre_company) || '/' ||
                     TO_CHAR(p_pre_year) || '/' || TRIM(p_pre_lname) || '.' ||
                     TRIM(p_pre_formname) || '.' || TO_CHAR(p_formid);

-- remove the link and empty directories
LET v_oscommand = 'rm -f -d ' || v_custlinkname;
SYSTEM v_oscommand;

-- form the new link
-- make sure the directory tree exists
LET v_oscommand = 'mkdir -p ' || TRIM(v_basedir) || '/' || TRIM(p_post_company) ||
                  '/' || TO_CHAR(p_post_year);
SYSTEM v_oscommand;

-- form the full link name
LET v_custlinkname = TRIM(v_basedir) || '/' || TRIM(p_post_company) || '/' ||
                     TO_CHAR(p_post_year) || '/' || TRIM(p_post_lname) || '.' ||
                     TRIM(p_post_formname) || '.' || TO_CHAR(p_formid);

-- get the actual location; this is the same as before, as the id has not changed
SELECT IFX_LO_PATH(form::LVARCHAR) INTO v_ifmxname
  FROM custforms WHERE formid = p_formid;

-- create the OS link
LET v_oscommand = 'ln -s -f ' || '''' || TRIM(v_ifmxname) || '''' || ' ' ||
                  v_custlinkname;
SYSTEM v_oscommand;

END PROCEDURE
CREATE TRIGGER
CREATE TRIGGER upd_tr UPDATE OF formid, company, year, lname, formname ON custforms
    REFERENCING OLD AS pre NEW AS post
    FOR EACH ROW (EXECUTE PROCEDURE change_link
        (pre.formid, pre.company, pre.year, pre.lname, pre.formname,
         post.company, post.year, post.lname, post.formname));
Results Example
Back to our example.
With this infrastructure in place, in addition to the Informix-named file, I now have user-friendly links on my file system that I can easily locate and identify.
INSERT
[informix@schma01-rhvm03 2023]$ pwd
/home/informix/blog/resources/user-friendly/Actian/2023
[informix@schma01-rhvm03 2023]$ ls
Sanchez.TaxForm123.2
If I do an ls -l, you can see that it is a link to the Informix blob file.
[informix@schma01-rhvm03 2023]$ ls -l
total 0
lrwxrwxrwx 1 informix informix 76 Oct 17 14:20 Sanchez.TaxForm123.2 -> /home/informix/blog/resources/esbsp_dir1/IFMXSB0/LO[2,2,1(0x102),1729188126]
UPDATE
If I then update her last name with UPDATE custforms SET lname = 'Simon' WHERE formid = 2, my file system now looks like this:
[informix@schma01-rhvm03 2023]$ ls -l
lrwxrwxrwx 1 informix informix 76 Oct 17 14:25 Simon.TaxForm123.2 -> /home/informix/blog/resources/esbsp_dir1/IFMXSB0/LO[2,2,1(0x102),1729188126]
DELETE
If I then DELETE this form with DELETE FROM custforms WHERE formid = 2, my directory structure looks like this:
[informix@schma01-rhvm03 2023]$ pwd
/home/informix/blog/resources/user-friendly/Actian/2023
[informix@schma01-rhvm03 2023]$ ls
[informix@schma01-rhvm03 2023]$
We Welcome Your Feedback
Please enjoy the new HCL Informix 15 external smartblob feature.
I hope this idea can make external smartblobs easier for you to use. If you have any feedback on the idea, especially on enhancements or experience in production, please feel free to contact me at mary.schulte@hcl-software.com. I look forward to hearing from you!
Find out more about the launch of HCL Informix 15.
Notes
1. Shadow directory permissions. In creating this example, I did not explore directory and file permissions, but rather just used general permissions settings on my sandbox server. You will likely want to control permissions to avoid some of the anomalies I discuss below.
2. Manual blob file deletion. With external smartblobs, if permissions are not controlled, it is possible that a user might delete the physical smartblob file itself from its directory. HCL Informix cannot prevent this from happening. If it does happen, HCL Informix does NOT delete the corresponding row; the blob file will just be missing. There may be aspects of links that can handle this automatically, but I have not investigated them for this blog.
3. Link deletion in the shadow directory. If permissions are not controlled, it is possible that a user might delete a logical link formed by this infrastructure. This solution does not detect that. If this is an issue, I would suggest a periodic maintenance job that cross-references the shadow directory links to the blob files to detect missing links, as sketched below. For blobs with missing links, write a database program to look up each row's location with the IFX_LO_PATH function and re-form the missing link.
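As a minimal sketch of such a job, assuming GNU find and this blog's example base directory, dangling links can be listed with find's -xtype test:

#!/bin/sh
# Report shadow links whose target blob file no longer exists.
# -type l selects symbolic links; -xtype l keeps only those whose
# target does not resolve (i.e., broken links).
find /home/informix/blog/resources/user-friendly -type l -xtype l -print

On UNIX flavors without GNU find, an equivalent per-link test such as [ ! -e "$link" ] can be used instead.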
4. Unique identifiers. I highly recommend using unique identifiers in this solution. In this simple example, I used formid. You don't want to clutter things up, of course, but depending on how you structure your shadow directories and filenames, you may need to include more unique identifiers to avoid duplicate directory and link names.
5. Empty directories. I did not investigate if there are options to rm in the delete stored procedure to clean up empty directories that might remain if a last item is deleted.
6. Production overhead. Excessive triggers and stored procedures are known to add overhead in a production environment. For this blog, I assume that OLTP activity on blobs is not excessive, so production overhead should not be an issue. That said, this solution has NOT been tested at scale.
7. NULL values. Make sure to consider the presence and impact of NULL values in columns used in this solution. For simplicity, I did not handle them here.
Informix is a trademark of IBM Corporation in at least one jurisdiction and is used under license.
The post User-Friendly External Smartblobs Using a Shadow Directory appeared first on Actian.
Read More
Author: Mary Schulte
Predictions are funny things. They often seem like a bold gamble, almost like trying to peer into the future with the confidence we inherently lack as humans. Technology's rapid advancement surprises even the most seasoned experts, especially when it progresses exponentially, as it often does. As physicist Albert A. Bartlett famously said, "The greatest shortcoming […]
The post AI Predictions for 2025: Embracing the Future of Human and Machine Collaboration appeared first on DATAVERSITY.
Read More
Author: Philip Miller
Unlocking the value of data is a key focus for business leaders, especially the CIO. While in its simplest form, data can lead to better insights and decision-making, companies are pursuing an entirely different and more advanced agenda: the holy grail of data monetization. This concept involves aggregating a variety of both structured and unstructured […]
The post Data Monetization: The Holy Grail or the Road to Ruin? appeared first on DATAVERSITY.
Read More
Author: Tony Klimas
Brands, publishers, MarTech vendors, and beyond recently gathered in NYC for Advertising Week and swapped ideas on the future of marketing and advertising. The overarching message from many brands was one we've heard before: First-party data is like gold, especially for personalization. But it takes more than "owning" the data to make it valuable. Scale and accuracy […]
The post Beyond Ownership: Scaling AI with Optimized First-Party Data appeared first on DATAVERSITY.
Read More
Author: Tara DeZao
The manufacturing industry is in the midst of a digital revolution. You've probably heard these buzzwords: Industry 4.0, IoT, AI, and machine learning, all terms that promise to revolutionize everything from assembly lines to customer service. Embracing this digital transformation is key to improving your competitive advantage, but new technology doesn't come without its own challenges. Each new piece of technology needs one thing to deliver innovation: data.
Data is the fuel powering your tech engines. Without the ability to understand where your data is, whether it's trustworthy, or who owns the datasets, even the most powerful tools can overcomplicate and confuse the best data teams. That's where modern data discovery solutions come in. They're like the backstage crew making sure everything runs smoothly: connecting systems, tidying up the data mess, and making sure everyone has exactly what they need, when they need it. That means faster insights, streamlined operations, and a lower total cost of ownership (TCO). In other words, data access is the key to staying ahead in today's fast-paced, highly competitive, increasingly sensitive manufacturing market.
Data from all aspects of your business is siloed. Whether it's coming from sensors, legacy systems, cloud applications, suppliers, or customers, trying to piece it all together is daunting, time-consuming, and just plain hard. Traditional methods are slow, cumbersome, and definitely not built for today's needs. This fragmented approach not only slows down decision-making but also keeps you from tapping into valuable insights that could drive innovation. And in a market where speed is everything, that's a recipe for falling behind.
So the big question is: how can you unlock the true potential of your data?
So how do you make data intelligence a streamlined, efficient process? The answer lies in modern data discovery solutions, the unsung catalyst of a digital transformation motion. Rather than simply integrating data sources, data discovery solutions excel in metadata management, offering complete visibility into your company's data ecosystem. They enable users, regardless of skill level, to locate where data resides and assess the quality and relevance of the information. By providing this detailed understanding of data context and lineage, organizations can confidently leverage accurate, trustworthy datasets, paving the way for informed decision-making and innovation.
One of the biggest hurdles in data integration is connecting to a variety of data sources, including legacy systems, cloud applications, and IoT devices. Modern data discovery tools like Zeenea offer easy connectivity, allowing you to extract metadata from various sources seamlessly. This unified view eliminates silos and enables faster, more informed decision-making across the organization.
Metadata is the backbone of effective data discovery. Advanced metadata management capabilities ensure that data is well-organized, tagged, and easily searchable. This provides a clear context for data assets, helping you understand the origin, quality, and relevance of your data. This means better data search and discoverability.
A data discovery knowledge graph serves as an intelligent map of your metadata, illustrating the intricate relationships and connections between data assets. It provides users with a comprehensive view of how data points are linked across systems, offering a clear picture of data lineage, from origin to current state. This visibility into the data journey is invaluable in manufacturing, where understanding the flow of information between production data, supply chain metrics, and customer feedback is critical. By tracing the lineage of data, you can quickly assess its accuracy, relevance, and context, leading to more precise insights and informed decision-making.
A data marketplace provides a centralized hub where you can easily search, discover, and access high-quality data. This self-service model empowers your teams to find the information they need without relying on IT, accelerating time to insight. The result? Faster product development cycles, improved process efficiency, and enhanced decision-making capabilities.
Modern data discovery platforms prioritize user experience with intuitive, user-friendly interfaces. Features like natural language search allow users to query data using everyday language, making it easier for non-technical users to find what they need. This democratizes access to data across the organization, fostering a culture of data-driven decision-making.
Traditional metadata management solutions often come with a hefty price tag due to high infrastructure costs and ongoing maintenance. In contrast, modern data discovery tools are designed to minimize TCO with automated features, cloud-based deployment, and reduced need for manual intervention. This means more efficient operations and a greater return on investment.
By leveraging a comprehensive data discovery solution, manufacturers can achieve several key benefits:
With quick access to quality data, teams can identify trends and insights that drive product development and process optimization.
Automated implementation and seamless data connectivity reduce the time required to gather and analyze data, enabling faster decision-making.
Advanced metadata management and knowledge graphs help streamline data governance, ensuring that users have access to reliable, high-quality data.
A user-friendly data marketplace democratizes data access, empowering teams to make data-driven decisions and stay ahead of industry trends.
With low TCO and reduced dependency on manual processes, manufacturers can maximize their resources and allocate budgets towards strategic initiatives.
Data is more than just a resource: it's a catalyst for innovation. By embracing advanced metadata management and data discovery solutions, you can find, trust, and access data. This not only accelerates time to market but also drives operational efficiency and boosts competitiveness. With powerful features like API-led automation, a data discovery knowledge graph, and an intuitive data marketplace, you'll be well-equipped to navigate the challenges of Industry 4.0 and beyond.
Ready to accelerate your innovation journey? Explore how Actian Zeenea can transform your manufacturing processes and give you a competitive edge.
Learn more about how our advanced data discovery solutions can help you unlock the full potential of your data. Sign up for a live product demo and Q&A.
The post Accelerating Innovation: Data Discovery in Manufacturing appeared first on Actian.
Read More
Author: Kasey Nolan
You never know what's going to happen when you click on a LinkedIn job posting button. I'm always on the lookout for interesting and impactful projects, and one in particular caught my attention: "Far North Enterprises, a global fabrication and distribution establishment, is looking to modernize a very old data environment." I clicked the button […]
The post Mind the Gap: Architecting Santa's List – The Naughty-Nice Database appeared first on DATAVERSITY.
Read More
Author: Mark Cooper
There is an urgent reality that many manufacturing leaders are facing, and that's data silos. Valuable information remains locked within departmental systems, hindering your ability to make strategic, well-informed decisions. A data catalog and enterprise data marketplace solution provides a comprehensive, integrated view of your organization's data, breaking down silos and enabling true collaboration.
In your organization, each department maintains its own critical datasets: finance compiles detailed financial reports, sales leverages CRM data, marketing analyzes campaign performance, and operations tracks supply chain metrics. But here's the challenge: how confident are you that you even know what data is available, who owns it, or whether it's high quality?
The issue goes beyond traditional data silos. It's not just that the data is isolated; it's that your teams are unaware of what data even exists. This lack of visibility creates a blind spot. Without a clear understanding of your company's data landscape, you face inefficiencies, inconsistent analysis, and missed opportunities. Departments end up duplicating work, using outdated or unreliable data, and making decisions based on incomplete information.
The absence of a unified approach to data discovery and cataloging means that even if the data is technically accessible, it remains hidden in plain sight, trapped in disparate systems without any context or clarity. Without a comprehensive search engine for your data, your organization will struggle to find, understand, and trust its information.
Now, imagine the transformative potential if your team could search for and discover all available data across your organization as easily as using a search engine. Implementing a robust metadata management strategy, including data lineage, discovery, and cataloging, bridges the gaps between disparate datasets, enabling you to understand what data exists, its quality, and how it can be used. Instead of chasing down reports or sifting through isolated systems, your teams gain an integrated view of your company's data assets.
This approach can transform the way each department operates, fostering a culture of informed decision-making and reducing inefficiencies.
This strategy isn't about collecting more data; it's about creating a clearer, more reliable picture of your entire business. By investing in a data catalog, you turn fragmented insights into a cohesive, navigable map that guides your strategic decisions with clarity and confidence. It's the difference between flying blind and having a comprehensive navigation system that leads you directly to success.
When you prioritize data intelligence with a catalog as a cornerstone, your organization gains access to a powerful suite of benefits.
Data silos are more than just an operational inconvenience; they are a barrier to your company's growth and innovation. By embracing data cataloging, lineage, and governance, you empower your teams to collaborate seamlessly, leverage accurate insights, and make strategic decisions with confidence. It is time to break down the barriers, integrate your metadata, and unlock the full potential of your organization's data.
Are you ready to eliminate data silos and gain a unified view of your operations? Discover the power of metadata management with our comprehensive platform. Visit our website today to learn more and sign up for a live product demo and Q&A.
The post From Silos to Synergy: Data Discovery for Manufacturing appeared first on Actian.
Read More
Author: Kasey Nolan
The market surrounding data management tools and technologies is quite mature. After all, the typical business has been making extensive use of data to help streamline its operations and decision-making for years, and many companies have long had data management tools in place. But that doesn't mean that little is happening in the world of […]
The post 5 Data Management Tool and Technology Trends to Watch in 2025 appeared first on DATAVERSITY.
Read More
Author: Matheus Dellagnelo
In today's business landscape, data reigns supreme. It is the cornerstone of effective decision-making, fuels innovation, and drives organizational success. However, despite its immense potential, many organizations struggle to harness the full power of their data due to a fundamental disconnect between IT and business teams. This division not only impedes progress but also undermines […]
The post How to Foster a Cross-Organizational Approach to Data Initiatives appeared first on DATAVERSITY.
Read More
Author: Abhas Ricky
The need for securing data from unauthorized access is not new. It has been required by laws for handling personally identifiable information (PII) for quite a while. But the increasing use of data services in the cloud for all kinds of proprietary data that is not PII now makes data security an important part of most data strategies.
This is the start of a series of blog posts that take a detailed look at how data security can be ensured with Actian Vector. The first post explains the basic concept of encryption at rest and how Actian Vector's Database Encryption functionality implements it.
Encryption at rest refers to the encryption of data at rest, meaning data that is persisted, usually on disk or in cloud storage. In a database system, this data is mainly user data in tables and indexes, but it also includes the metadata describing the organization of the user data. The main purpose of encryption at rest is to secure the persisted data from unauthorized direct access on disk or in cloud storage, that is, access without a connection to the database system.
The encryption can be transparent to the database applications. In this case, encryption and decryption are managed by the administrator, usually at the level of databases. The application does not need to be aware of the encryption: it connects to the database to access and work with the data as if there were no encryption at all. In Actian Vector, this type of encryption at rest is called database encryption.
Encryption at the application level, on the other hand, requires the application to handle the encryption and decryption. Often this means that the user of the application has to provide an encryption key for both the encryption (e.g., when data is inserted) and the decryption (e.g., when data is selected). While more complicated, this gives more control to the application and the user.
For example, encryption can be applied in a more fine-grained way to specific tables, columns in tables, or even individual record values in table columns. It may be possible to use individual encryption keys for different data values. Thus, users can encrypt their private data with their own encryption key and be sure that without this key, no other user can see the data in clear text. In Actian Vector, encryption at the application level is referred to as function-based encryption.
In Actian Vector, the encryption that is transparent to the application works at the scope of a database and is therefore called database encryption. Whether a database is encrypted is determined when the database is created and cannot be changed later. When a database is created with database encryption, all the persisted data in tables and indexes, as well as the metadata for the database, is encrypted.
The encryption method is 256-bit AES, which requires a 32-byte symmetric encryption key. Symmetric means that the same key is used to encrypt and decrypt the data. This key is generated individually for each encrypted database and is called the database (encryption) key.
To keep the database key available, it is stored in an internal system file of the database server, where it is protected by a passphrase. This passphrase is provided by the user when creating the database. However, the database key is not used to encrypt the user data directly. Instead, it is used to encrypt, i.e., protect, another set of encryption keys that in turn are used to encrypt the user data in the tables and indexes. This set of encryption keys is called the table (encryption) keys.
Once the database is created, the administrator can use the chosen passphrase to "lock" the database. When the database is locked, the encrypted data cannot be accessed. Likewise, the administrator uses the passphrase to "unlock" a locked database and thus re-enable access to the encrypted data. When the database is unlocked, the administrator can change the passphrase. If desired, it is also possible to rotate the database key when changing the passphrase.
Rotation of the database key is optional because it means that the whole container of table keys must be decrypted with the old database key and then re-encrypted with the new database key. Because this container also holds other metadata, it can be quite large, so rotating the database key can become a slow and computationally expensive operation. Database key rotation is therefore recommended only if there is reasonable suspicion that the database key was compromised. Most of the time, changing only the passphrase is sufficient, and it is done quickly.
With Actian Vector it is also possible to rotate the table encryption keys. This is done independently of changing the passphrase and the database key, and can be performed on a complete database as well as on individual tables. For each key that is rotated, the data must be decrypted with the old key and re-encrypted with the new key. Here the affected data is the user data in tables and indexes, so if it is very large, the key rotation can be very costly and time-consuming. This is especially true when rotating all the table keys of a database.
To create a database with encryption, use the -encrypt option of the createdb command:

createdb -encrypt <database_name>
This command prompts the user twice for the passphrase and then creates the database with encryption. The new database remains unlocked, i.e. it is readily accessible, until it is explicitly locked or until shutdown of the database system.
It is important that the creator of the database remembers the provided passphrase because it is needed to unlock the database and make it accessible, e.g. after a restart of the database system.
To lock an encrypted database, connect to it and issue the DISABLE PASSPHRASE statement:

sql <database_name>

DISABLE PASSPHRASE '<user supplied passphrase>'; \g
The SQL statement locks the database. New connection attempts to the database are rejected with a corresponding error. Previously connected sessions can still access the data until they disconnect.
To make the database lock immediately effective for already connected sessions as well, additionally issue the following SQL statement:
CALL X100(TERMINATE); \g
To unlock the database, connect with the -no_x100 option and issue the ENABLE PASSPHRASE statement:

sql -no_x100 <database_name>

ENABLE PASSPHRASE '<user supplied passphrase>'; \g
The connection with the -no_x100 option connects without access to the warehouse data, but allows the administrative SQL statement to unlock the database.
To change the passphrase of an unlocked database, connect to it and issue the ALTER PASSPHRASE statement:

sql <database_name>

ALTER PASSPHRASE '<old user supplied passphrase>' TO '<new passphrase>'; \g
Again, it is important that the administrator remembers the new passphrase.
After changing the passphrase for an encrypted database, it is recommended to perform a new database backup (a.k.a. "database checkpoint") to ensure continued full database recoverability.
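A minimal sketch, assuming the standard ckpdb checkpoint utility that ships with Actian Vector:

# Take a fresh checkpoint (backup) of the database after the passphrase change.
ckpdb <database_name>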
To destroy an encrypted database, use the destroydb command:

destroydb <database_name>
Note that the passphrase of the encrypted database is not needed to destroy it. The command can only be performed by users with the proper privileges, i.e. the database owner and administrators.
This first blog post in the database security series explained the concept of encryption at rest and how transparent encryption, called Database Encryption in Actian Vector, is used.
The next blog post in this series will take a look at function-based encryption in Actian Vector.
The post Securing Your Data With Actian Vector appeared first on Actian.
Read More
Author: Martin Fuerderer
There is no doubt that machine learning (ML) is transforming industries across the board, but its effectiveness depends on the data it's trained on. ML models traditionally rely on real-world datasets to power the recommendation algorithms, image analysis, chatbots, and other innovative applications that make it so transformative. However, using actual data creates two significant challenges […]
The post Synthetic Data Generation: Addressing Data Scarcity and Bias in ML Models appeared first on DATAVERSITY.
Read More
Author: Anshu Raj
Organizations across every vertical face numerous challenges managing their data effectively and with full transparency. That's at least partially due to data often being siloed across multiple systems or departments, making it difficult for employees to find, trust, and unlock the value of their company's data assets.
Enter the Actian Zeenea Data Discovery Platform. This data intelligence solution is designed to address data issues by empowering everyone in an organization to easily find and trust the data they need to drive better decision-making, streamline operations, and ensure compliance with regulatory standards.
The Zeenea platform serves as a centralized data catalog and an enterprise data marketplace. By improving data visibility, access, and governance, it provides a scalable and efficient framework for businesses to leverage their data assets. The powerful platform helps organizations explore new and sustainable use cases, including these five:
Data professionals are well familiar with the struggles of working in environments where data is fragmented across departments and systems. This leads to data silos that restrict access to critical information, which ends up creating barriers to fully optimizing data.
Another downside to having barriers to data accessibility is that users spend significant time locating data instead of analyzing it, resulting in inefficiencies across business processes. The Zeenea platform addresses accessibility issues by providing a centralized, searchable repository of all data assets.
The repository is enriched with metadata (such as data definitions, ownership, and quality metrics) that gives context and meaning to the organization's data. Both technical and non-technical users can quickly find and understand the data they need, either by searching for specific terms, filtering by criteria, or through personalized recommendations. This allows anyone who needs data to quickly and easily find what they need without requiring IT skills or relying on another team for assistance.
For example, marketing analysts looking for customer segmentation data for a new campaign can quickly locate relevant datasets in the Zeenea platform. Whether analysts know exactly what they're searching for or are browsing through the data catalog, the platform provides insights into each dataset's source, quality, and usage history.
Based on this information, analysts can decide whether to request access to the actual data or consult the data owner to fix any quality issues. This speeds up the data usage process and ensures that decision-makers have access to the best available data relevant for the campaign.
In many organizations, data access is often limited to technical teams such as IT or data engineering. Being dependent on specialty or advanced skills creates bottlenecks because business users must request data from other teams. This reliance on IT or engineering departments leads to delayed insights and increases the workload on technical teams that may already be stretched thin.
The Zeenea platform helps by democratizing data access, enabling non-technical users to explore and "shop" for data in a self-service environment. With Zeenea's Enterprise Data Marketplace, business users can easily discover, request, and use data that has been curated and approved by data governance teams. This self-service model reduces the reliance on IT and data specialists, empowering all employees across the organization to make faster, data-driven decisions.
Barrier-free data access can help all users and departments. For instance, sales managers preparing for a strategy meeting can use the Enterprise Data Marketplace to access customer reports and visualizations, without needing to involve the data engineering team.
By using the Zeenea platform, sales managers can pull data from various departments, such as finance, sales, or marketing, to create a comprehensive view of customer behavior. This allows the managers to identify opportunities for improved engagement as well as cross-sell and upsell opportunities.
As organizations strive to meet stringent regulatory requirements that seem to change constantly, having visibility into both data origins and data transformations becomes essential. Understanding how data has been sourced, modified, and managed is crucial for compliance and auditing processes. However, without proper tracking systems, tracing this information accurately can be extremely difficult.
This is another area where the Zeenea platform can help. It provides detailed data lineage tracking, allowing users to trace the entire lifecycle of a dataset. From the data's origin to its transformation and usage, the platform offers a visual map of data flows, making it easier to troubleshoot errors, detect anomalies, and verify the accuracy of reports.
With this capability, organizations can present clear audit trails to demonstrate compliance with regulatory standards. A common use case is in the financial sector: a bank facing a regulatory audit can leverage Zeenea's data lineage feature to show auditors exactly how financial data has been handled.
By comprehensively tracing each dataset, the bank can easily demonstrate compliance with industry regulations. Plus, having visibility into data reduces the complexity of the audit process and builds trust in data management practices.
Managing data governance in compliance with internal policies and external regulations is another top priority for organizations. With laws such as GDPR and HIPAA that have strict penalties, companies must ensure that sensitive data is handled securely and data usage is properly tracked.
The Zeenea platform delivers capabilities to meet this challenge head-on. It enables organizations to define and enforce governance rules across their data assets, ensuring that sensitive information is securely managed. Audit trail, access control, and data lineage features help organizations comply with regulatory requirements. These features also play a key role in ensuring data is properly cataloged and monitored.
Organizations in industries like healthcare that handle highly sensitive information can benefit from the Zeenea platform. The platform can help companies, like those in healthcare, manage access controls, encryption, and data monitoring. This ensures compliance with HIPAA and other regulations while safeguarding patient privacy. Additionally, the platform streamlines internal governance practices, ensuring that all data users follow established guidelines for data security.
The Actian Zeenea Data Discovery Platform offers a comprehensive solution to solve modern data management challenges. By improving data discovery, governance, and access, the Zeenea platform removes barriers to data usage, making it easier for organizations to unlock the full value of their data assets.
Whether it's giving business users self-service capabilities, streamlining compliance efforts, or supporting a data mesh approach that decentralizes data management, the platform gives individual departments the ability to manage their own data while maintaining organization-wide visibility. Additionally, the platform provides the tools and infrastructure needed to thrive in today's data-driven world.
Organizations looking to improve their data outcomes should consider the Zeenea platform. By creating a single source of truth for data across the enterprise, the solution enables faster insights, smarter decisions, and stronger compliance, all key drivers of business success in the digital age. Find out more by joining a live product demo.
The post 5 Reasons to Invest in a Next-Gen Data Catalog appeared first on Actian.
Read More
Author: Dee Radh
Welcome to December 2024's "Book of the Month" column. This month, we're featuring "AI Governance Comprehensive: Tools, Vendors, Controls, and Regulations" by Sunil Soares, available for free download on the YourDataConnect (YDC) website. This book offers readers a strong foundation in AI governance. While the emergence of generative AI (GenAI) has brought AI governance to […]
The post Book of the Month: "AI Governance Comprehensive" appeared first on DATAVERSITY.
Read More
Author: Mark Horseman
In the AI era, organizations are eager to harness innovation and create value through high-quality, relevant data. Gartner, however, projects that 80% of data governance initiatives will fail by 2027. This statistic underscores the urgent need for robust data platforms and governance frameworks. A successful data strategy outlines best practices and establishes a clear vision for data architecture, […]
The post Technical and Strategic Best Practices for Building Robust Data Platforms appeared first on DATAVERSITY.
Read More
Author: Alok Abhishek
Imagine a world where your data not only tells a story but also anticipates your next move: this is the promise of effective data management in the AI era. As organizations try to deal with vast amounts of information, three key components have emerged as essential for unlocking the full potential of data: metadata, […]
The post Essential Components for Effective Data Management in the AI Era appeared first on DATAVERSITY.
Read More
Author: Joel Christner
In recent years, a large number of businesses have jumped on the wave and moved their applications and data to the cloud, with nine out of 10 IT professionals considering the cloud a cornerstone of their digital strategy. While the cloud can offer flexibility, the reality is that managing cloud expenses and resulting security liabilities have become significantly […]
The post The Rise of Cloud Repatriation in Storage Solutions appeared first on DATAVERSITY.
Read More
Author: Roger Brulotte