Data preparation makes it possible to extract accurate insights from data analysis for business use. Developing those insights requires connecting multiple data sources and ensuring the data is clean and consistent prior to analysis.
There are three typical data preparation challenges most analysts must overcome:
- Handling large volumes of data – Some types of analysis require handling enormous amounts of data from multiple sources. To manage the sheer amount of data, a strong IT infrastructure is necessary to efficiently deliver analysis results.
- Visual representation of data – Insights gleaned from data in its raw form aren’t always easy to understand. Visuals offer a simple method to make data readable and understandable. Graphs, tables and charts are helpful tools for sharing insights.
- Scalability of the application – Data volumes don’t remain static. Additionally, the time it takes for data to travel from the database to the front end can affect performance and scalability. Maintaining the overall data application architecture and technology is key to reducing performance and scalability issues.
Business data analysts can pre-emptively alleviate many data preparation challenges with Quest® Toad® Data Point.
Handling large data volumes in less time
Business analysts generally need a subset of the data. As with most computing environments, the smaller the amount of data to process, the faster the result set is returned. Proper data preparation is necessary to optimize the data set.
I have been to many sites that keep tens of millions of rows online and available for an entire year. Database technologies such as partitioning assist with searching and processing such substantial amounts of data, but many of them are organized around a particular key value, such as a date. If end users are not careful to include that key in their query, they could end up accessing the entire database rather than just the subset of rows they want.
Within Toad Data Point, a business analyst or end user can create specific subsets of data themselves, and either store them on their local workstation (local storage) or share them for the entire department to use.
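To illustrate why the partition key matters, here is a minimal Python sketch. It is not Toad Data Point or any real database engine; the `partitions` dictionary, the `query` function, and the sample rows are all hypothetical. It only shows that a request that includes a date range lets whole partitions be skipped, while a request without one touches every partition.

```python
from datetime import date

# Hypothetical table partitioned by month: (year, month) -> rows.
partitions = {
    (2023, 1): [{"order_id": 1, "order_date": date(2023, 1, 15), "amount": 120.0}],
    (2023, 2): [{"order_id": 2, "order_date": date(2023, 2, 3), "amount": 75.5}],
    (2023, 3): [
        {"order_id": 3, "order_date": date(2023, 3, 9), "amount": 310.0},
        {"order_id": 4, "order_date": date(2023, 3, 21), "amount": 42.0},
    ],
}


def query(min_amount, start=None, end=None):
    """Return (matching rows, number of partitions scanned)."""
    scanned = 0
    rows = []
    for key, part_rows in partitions.items():
        # With a date range, partitions outside it are skipped entirely.
        if start is not None and key < (start.year, start.month):
            continue
        if end is not None and key > (end.year, end.month):
            continue
        scanned += 1
        for row in part_rows:
            if row["amount"] < min_amount:
                continue
            if start is not None and row["order_date"] < start:
                continue
            if end is not None and row["order_date"] > end:
                continue
            rows.append(row)
    return rows, scanned


# No date filter: every partition is read.
all_rows, scanned_all = query(50.0)
# Date filter on the partition key: only March is read.
march_rows, scanned_march = query(50.0, date(2023, 3, 1), date(2023, 3, 31))
print(scanned_all, scanned_march)
```

The same idea applies to a real query: including the partitioning column (here, the order date) in the filter is what keeps the request from reading the whole year of data.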
There are two types of data subsetting that come to mind when I think of Toad Data Point:
- Using Toad Data Point refreshable snapshot
- Using an automation script
If you are going to share this data with others in the department, this is a perfect scenario for Toad Intelligence Central, which allows you to store and share almost anything, but especially data. Either the refreshable snapshot or the automation script can run unattended in off hours and pull the data needed for the reporting time frame. Depending on your needs and the amount of physical computer storage available, you can save the data subset for future use and save manual data preparation time.
If the data is being subsetted just for your needs, you can store the output in your local storage. You can automate this subset of data using an automation script that you would initiate.
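As a rough analogy for what such a subsetting job does, the Python sketch below rebuilds a small local table from a larger one for a single reporting window. It uses the standard `sqlite3` module with an in-memory database. The `sales` and `sales_subset` tables and the `refresh_subset` function are assumptions made for illustration, not Toad Data Point's automation format.

```python
import sqlite3

# In-memory database standing in for the full source table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (1, "2023-01-10", 100.0),
        (2, "2023-02-14", 250.0),
        (3, "2023-02-20", 80.0),
        (4, "2023-03-05", 40.0),
    ],
)
# Local table that holds only the rows needed for reporting.
conn.execute("CREATE TABLE sales_subset (id INTEGER, sale_date TEXT, amount REAL)")


def refresh_subset(conn, start, end):
    """Replace the subset with rows from one reporting window; return the row count."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM sales_subset")
        conn.execute(
            "INSERT INTO sales_subset "
            "SELECT id, sale_date, amount FROM sales "
            "WHERE sale_date BETWEEN ? AND ?",
            (start, end),
        )
    return conn.execute("SELECT COUNT(*) FROM sales_subset").fetchone()[0]


february_count = refresh_subset(conn, "2023-02-01", "2023-02-28")
print(february_count)
```

Running the function again with a different window replaces the previous subset, which is roughly what a scheduled refresh does for each new reporting period.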
See the related blog post on Data Automation.
Use a Toad Data Point Refreshable Snapshot during data preparation
Saving the data as a refreshable snapshot is easy. Run your query, then right-click the data grid and select 'Send To', then 'Publish Data'.
Notice you can save the data to local storage as well, but local storage does not support the automatic updating of the data as snapshots do.
Toad Data Point Refreshable Snapshot Panel
The Publish Type should be set to Snapshot. Give the snapshot an appropriate name. I used 'EMP_SNAPSHOT' for this example. Click the Scheduling link.
Toad Data Point Refreshable Snapshot Scheduling Panel
On this page, fill out the schedule for the refresh to occur. You can change this event anytime using the Toad Intelligence Central web page.
Click Apply to accept this change, then click Publish to initiate the creation of the snapshot.
Navigation for the Toad Data Point Refreshable Snapshot Scheduling Panel
Notice you have the EMP_SNAPSHOT object in your Toad Intelligence Central environment. You can now include this data in your work as you would any other data object.
Please note this is a simplistic view of Toad Intelligence Central. With a little forethought, you can easily organize your central repository with folders and allow various levels of access to the people using the repository for data preparation.
Visual representation of data
For enhanced data preparation, Toad Data Point has both a full-featured report writer and useful graphics. I tend to use the graphics within the report writer, but you can choose what is best for your application. The report writer is easy to use and lends itself well to automation. My blog post on TDP Data Automation shows how to use this report writer to pass data from the automation script into the report for specific reporting needs (such as customized form letters with the recipient’s name, data and graphics).
Toad Data Point Report Wizard for data preparation
To access the report writer, right-click any data grid and select 'Send To', then 'Data Report Designer.'
Welcome screen for Toad Data Point Report Wizard
The Report Wizard allows you to easily format your data into general type reports or mailing labels.
Column selection in Toad Data Point Report Wizard
This panel allows you to select any or all columns from the data set being passed to the report wizard.
Grouping levels in Toad Data Point Report Wizard
You can create “group by” style reports. For this example, I will group by department.
Data preparation summary options in Toad Data Point Report Wizard
This panel lets you add some summary math to the report, which again eases your learning curve and reduces the need to go into the report designer to make these changes.
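For readers who want to see what a grouped report with summary math computes, here is a small Python sketch. The employee rows, the `summarize_by_department` function, and the choice of count, sum, and average are assumptions for illustration; the wizard's actual summary options may differ.

```python
from collections import defaultdict

# Hypothetical data set passed to the report.
employees = [
    {"name": "Adams", "department": "Sales", "salary": 52000},
    {"name": "Baker", "department": "Sales", "salary": 48000},
    {"name": "Clark", "department": "Research", "salary": 61000},
]


def summarize_by_department(rows):
    """Group rows by department and compute count, sum, and average salary."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["department"]].append(row["salary"])
    return {
        dept: {
            "count": len(salaries),
            "sum": sum(salaries),
            "avg": sum(salaries) / len(salaries),
        }
        for dept, salaries in sorted(groups.items())
    }


summary = summarize_by_department(employees)
for dept, stats in summary.items():
    print(dept, stats["count"], stats["sum"], stats["avg"])
```

Each department becomes one group in the report, and the summary values are the kind of totals that would appear in that group's footer.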
Report layout options in Toad Data Point Report Wizard
This panel gives you some control over the basic layout of the data.
Report layout design in Toad Data Point
This is the report designer. This tool is much like most report writers I have used.
This product is called Fast Reports. Documentation is available for this entire tool by googling 'Fast Reports'.
Report output in Toad Data Point
This wizard is a good place to start, as it lays out the data in several formats. You can add charts, graphs, gauges, etc. You can also pass variables to this report and run it from within an automation script. The automation script can then email out the report!
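The idea of passing variables into a report and then emailing it can be sketched in plain Python. This is only an analogy using the standard library, not the Toad Data Point automation or report writer interface; the `LETTER` template, the `build_message` function, and the sample values are hypothetical. The sketch builds the message but does not send it.

```python
from email.message import EmailMessage
from string import Template

# Hypothetical form letter; each $placeholder is a variable passed to the report.
LETTER = Template(
    "Dear $name,\n\nYour $department total for $period is $total.\n"
)


def build_message(recipient, values):
    """Fill the letter with one recipient's values and wrap it in an email."""
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = f"Report for {values['period']}"
    msg.set_content(LETTER.substitute(values))
    return msg


msg = build_message(
    "pat@example.com",
    {"name": "Pat", "department": "Sales", "period": "March", "total": "100,000"},
)
print(msg["Subject"])
print(msg.get_content())
```

Looping over a list of recipients and calling `build_message` for each row is the scripted equivalent of a customized form letter run.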
Application should be scalable
Toad Data Point is a single, installable client-side application for data preparation. This useful tool takes advantage of the existing computing environment.
If your workstations can use solid state disk (SSD) technology for enhanced speed, so can the location where Toad Intelligence Central maintains its data repository. This technology will improve throughput to Toad Data Point, both for end users working with the tool in real time and for any Toad Data Point automation scripts.
Try Toad Data Point for free
Learn how Toad Data Point can help you access and prepare data faster. Seamlessly access more than 50 data sources, both on premises and in the cloud, and switch between these data sources with near-zero transition times.
Get started with our free 30-day trial.
Closing comments on data preparation
Toad Data Point is perfect for your data discovery and data preparation needs. Toad Data Point can subset the data and automate this task so that the data is both qualified and quantified. Toad Data Point also has a full-featured report writer and graphics for visual data representation.
Dan Hotka has several course offerings that use Toad and Toad Data Point.
If you have any questions, please post them to the Toad Data Point forum on Toad World.