Collecting QVD metadata is important for many reasons. Let’s look at some code you can adapt to help you collect metadata for all your QVD-generating applications.
Collecting QVD Metadata Aids Debugging and Alerts You to Problems
Most organizations will use QVD Generators to build out QVD data layers suitable for QlikView’s consumption. There is generally no need to create any visualizations in these QVD Generators except for maybe some validation tables and some QVD metadata. Today we will focus on an easy way to gather those statistics. The QVD metadata I generally gather helps me answer questions about the output QVDs.
- Which QVDs were written with this application?
- How long did it take to create this QVD?
- How many rows were written?
- How many fields were in each table?
Obviously, being able to easily answer these questions helps us better understand and serve the BI enterprise. It also provides a helpful tool for debugging logical errors in our QlikView scripts.
And because we actually take this statistics table and write it to QVD as well, we can use it in later loads as needed to create a data source lineage for a particular application. Another benefit of creating this metadata QVD is that I can drop all the tables that created the QVDs while still retaining the statistics. This leaves the application light and quick to save.
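Because the statistics live in a QVD of their own, a later application can load them back to trace lineage. Here is a minimal sketch; the file name is an example that follows the naming convention used later in this article, so match it to your own statistics path:

```
// Hypothetical downstream load: which tables fed this source, and when?
Lineage:
LOAD TableName, ReloadedDate, Records, Duration
FROM [ETL_Statistics_NA_BAAN.qvd] (qvd)
WHERE Source = 'BAAN';
```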
Here is an example of the front end to show you what kind of data will be collected and displayed. Of course, the metadata you collect and present is really up to you, the developer.
How to set up the collection
I use blocks of code that can be dropped into basically any script and adjusted as needed. There are four chunks of code that get deposited into our script.
- Set up any needed global variables and set up the statistics table structure.
- Set up a sub-routine that creates the stat table.
- Change the start time and table name each time we go to a new QVD.
- Call the subroutine.
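Laid out in a script, those four chunks sit roughly like this. This is only an outline (the subroutine name `StoreStatistics` is one I am assuming for illustration); the full code for each chunk follows below.

```
// (1) Setup: variables and the statistics table structure (once, near the top)
SET vSource = 'BAAN';
// ...remaining variables and the Statistics table...

// (2) Subroutine definition (once, just after the setup)
SUB StoreStatistics
// ...append this load's metadata to the statistics QVD...
END SUB

// For every QVD the script builds:
// (3) mark the start of the load
LET vStartTime = NOW();
SET vTable = 'ETL_Sales_Header';
// ...load the data and STORE the QVD...

// (4) record the statistics
CALL StoreStatistics;
```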
Set up any needed global variables and set up the statistics table structure
First, create any variables that will need to be collected in your metadata or displayed in your front end. For this example, we use variables for the working directory, the environment name, the source, the division, and the path for the statistics QVD. These two chunks of script should be inserted towards the beginning of the script, before any loading occurs.
SET vDir = 'C:\Users\Aaron\Documents\QlikView Apps\Blog Apps\Statistics Collection';
SET vEnv = 'Dev';
//SET vDir = 'D:\QlikView Documents\Production\Sales';

// Set variables for QVD naming
SET vSource = 'BAAN';
SET vDivision = 'NA'; // example value only; set this to your own division code
LET StatisticsPath = '$(vDir)\ETL_Statistics_$(vDivision)_$(vSource).qvd';
We also will look for an existing statistics QVD and if one is not found, we will create it.
// Check whether the statistics QVD already exists
IF NOT IsNull(QvdCreateTime('$(StatisticsPath)')) THEN
TRACE Statistics QVD found;
ELSE
// The Statistics QVD needs to be created
Statistics:
LOAD * INLINE [
TableName, Division, Source, ReloadedDate, StartTime, EndTime, Duration, Records, Fields
];
STORE Statistics INTO $(StatisticsPath) (qvd);
DROP TABLE Statistics;
END IF
Set up a sub-routine that creates the metadata table
Next we create a subroutine that appends the QVD metadata each time we load new data into a QVD. This script records all of the needed data from whatever load has just occurred, adds it to the existing statistics records and then saves the whole thing back to QVD. This script gets inserted just below the setup code above and will be called at various times later in the script.
SUB StoreStatistics
// There must be a vStartTime and a vTable variable
LET vEndTime = NOW();
LET vRecords = NoOfRows('$(vTable)');
LET vFields = NoOfFields('$(vTable)');

// Drop any in-memory copy so the reload below does not duplicate rows
IF NOT IsNull(TableNumber('Statistics')) THEN
DROP TABLE Statistics;
END IF

Statistics:
LOAD * FROM $(StatisticsPath) (qvd) WHERE TableName <> '$(vTable)';

CONCATENATE (Statistics)
LOAD DISTINCT
'$(vTable)' AS TableName, '$(vDivision)' AS Division, '$(vSource)' AS Source,
DATE(NOW()) AS ReloadedDate, '$(vStartTime)' AS StartTime, '$(vEndTime)' AS EndTime,
INTERVAL(NUM('$(vEndTime)') - NUM('$(vStartTime)'), 'hh:mm:ss') AS Duration,
$(vRecords) AS Records, $(vFields) AS Fields
RESIDENT $(vTable);

STORE Statistics INTO $(StatisticsPath) (qvd);
END SUB
Change the start time and table name each time we go to a new QVD
So now we go about the business of loading data from our data sources and creating QVDs. The basic idea is that we will need to establish some variables at the beginning of each load and then run the subroutine at the end of each load. Using this approach we can collect statistics for every QVD that gets created. Start by adding this script to the top of each data loading section. Note the vTable variable could also be used within your load statements to name your table.
LET vStartTime = NOW();
SET vTable = 'ETL_Sales_Header';
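For instance, the variable can double as the table label itself, so the name lives in one place and always matches the statistics. A quick sketch, with an illustrative source path:

```
// $(vTable) names the table, so the label and the recorded statistics agree
$(vTable):
LOAD *
FROM [$(vDir)\Extracts\Sales_Header.qvd] (qvd); // illustrative source file
```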
Call the subroutine
Finally, we can call the statistics subroutine that will record our ending variables and append the QVD. This simple code will run the routine we created in the second step.
Place this code after we store each data QVD. One thing to note: wait to drop each data table until after this subroutine runs, or simply drop them all at the end of your script.
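Assuming the subroutine from step two was declared as `SUB StoreStatistics` and the current table is named by `vTable`, the end of each load section looks something like this:

```
STORE $(vTable) INTO [$(vDir)\$(vTable).qvd] (qvd); // store the data QVD first
CALL StoreStatistics; // record row and field counts while the table is still loaded
DROP TABLE $(vTable); // drop only after the statistics have been captured
```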
Result Data has a good article discussing this subject as well.
If you want some guidance on using QVDs in your Qlik architecture, you might read this article.
There are many ways we can simplify the data we are collecting or expand the data. Hopefully this structure is clear and economical enough for you to adjust to your needs. The example I provided is simply what I use to complete this task and I tweak the code depending on the situation.
As always, I hope you find this valuable. Please let me know what you think. And if you accomplish QVD metadata collection differently or can improve upon this, please participate in the discussion.