Capturing a Workload with WorkloadTools


Last week I showed you how to use WorkloadTools to analyze a workload. As you have seen, using SqlWorkload to extract performance data from your workload is extremely easy and it just takes a few keystrokes in your favorite text editor to craft the perfect .json configuration file.

Today I’m going to show you how to capture a workload and save it to a file. If you’ve ever tried to perform this task with any other traditional benchmarking tool, like RML Utilities or Distributed Replay, your palms are probably sweaty already, but fear not: no complicated traces to set up, no hypertrophic scripts to create Extended Events captures. WorkloadTools makes it as easy as it can get.

Saving a workload to a file might look superfluous when you consider that WorkloadTools has the ability to perform replays in real time (I’ll discuss this feature in a future post), but there are situations when you want to replay the exact same workload multiple times, maybe changing something in the target database between each benchmark to see precisely what performance looks like under different conditions.

Another scenario where saving the workload to a file comes in handy is when the source and destination servers are on different, isolated networks and cannot communicate directly: in this case, you will save the workload to a file, copy it to the network where the target server resides and perform the replay there.

Choosing a file format

Before I show you how to do it, please allow me to explain what type of file we are going to write to and why I decided to use this format. If you’re not interested in this kind of geeky stuff, you can skip to the next section, no offense taken.

As already mentioned in my previous blog post, SqlWorkload uses a Listener object that relies on the underlying technology to capture the execution events that form the workload, so we have specialized Listeners for SqlTrace and for Extended Events. However, the Listener reads the events from the underlying source and then discards the source immediately.

When the workload is captured with an Extended Events session and the events are processed with the streaming API for Extended Events, it is pretty obvious how this works and why there is no .xel file at the end of the capture: the events are never written to a file, but are processed on the fly. In the same way, when the workload is captured with SqlTrace, SqlWorkload makes sure to keep the trace files on the server’s file system for as short a time as possible, so it reads the events from the tail of the active rollover file and forwards them to the Consumers immediately: again, at the end of the capture, there will be no trace file available.

Long story short: using one of the traditional capture formats would require instructing SQL Server to leave the files on the file system, which is not what we want and often something we cannot do.

Moreover, choosing one of the traditional formats (SqlTrace or Extended Events) would be problematic: Extended Events only works with SQL Server 2012 and newer, and SqlTrace can only be read and written with an old 32-bit COM API.

What I decided to use as the intermediate format for SqlWorkload is a SQLite database, which contains a handful of tables that persist all the queries captured by the Listener. Yeah, sure: when you’re a database professional it feels very natural to use a database to store information (if all you have is a hammer, everything looks like a nail), but I can assure you that it is far from an unreasonable decision. Using a SQLite database allows you to inspect, modify and filter the workload data with extreme ease, using industry standard tools.
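To give you an idea, here is a minimal sketch of trimming a capture down to a single database with the standard sqlite3 shell. The table and column names (Events, database_name) are hypothetical, used only for illustration: list the actual tables with the .tables command before trying this.

sqlite3 C:\temp\SqlWorkload.sqlite
-- "Events" and "database_name" are hypothetical names used for illustration:
-- run .tables and PRAGMA table_info(...) to discover the actual schema
DELETE FROM Events WHERE database_name <> 'MyDatabase';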

But enough talking about my decision process: let’s go straight to the meat of this post.

Capturing a Workload

As you have probably guessed, capturing a workload is a matter of preparing a .json file with all the parameters and feeding it to SqlWorkload.

All you need is a Listener that captures the workload using SqlTrace or Extended Events, and a Consumer that takes care of writing it to a SQLite file.

The .json file will be similar to this:

{
    "Controller": {

        "Listener":
        {
            // I'm using Extended Events here, but it could
            // be any type of listener
            "__type": "ExtendedEventsWorkloadListener",
            "ConnectionInfo":
            {
                "ServerName": "(local)"
            },
            "DatabaseFilter": "MyDatabase"
        },

        "Consumers":
        [
            {
                // The File Writer consumer takes care
                // of saving the workload to a file
                "__type": "WorkloadFileWriterConsumer",
                "OutputFile": "C:\\temp\\SqlWorkload.sqlite"
            }
        ]
    }
}

Once the configuration file is ready, you just need to pass its path as a command line argument to SqlWorkload, and the Listener will start to forward all the events to the File Writer Consumer.
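For example, assuming that you saved the configuration above as C:\temp\capture.json (the path here is just an example), the capture is started like this:

"%ProgramFiles%\WorkloadTools\SqlWorkload.exe" --File "C:\temp\capture.json"

You will see an output similar to this: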

[Screenshot: SqlWorkload console output during the capture]

When you are done capturing the workload, you can stop SqlWorkload by pressing CTRL+C and you will be left with a SQLite file containing your entire workload. You can actually open that file with a SQLite client of your choice and see what it contains. SQLite Browser is one of the possibilities, but there are countless options on Windows, Linux and Mac.
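If you prefer the command line, the standard sqlite3 shell works just as well. Here is a minimal sketch of a sanity check; the Events table name is my assumption for illustration, the authoritative list of tables comes from the .tables command:

sqlite3 C:\temp\SqlWorkload.sqlite
.tables                          -- list the tables created by the File Writer Consumer
SELECT COUNT(*) FROM Events;     -- "Events" is a hypothetical table name: use what .tables shows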

[Screenshot: the workload file opened in SQLite Browser]

Converting an existing SqlTrace or Extended Events file

You are probably wondering what to do if you already have a set of SqlTrace or Extended Events files that describe your workload perfectly: you don’t want to capture the workload again just for the sake of using it with SqlWorkload. Don’t worry: WorkloadTools also contains a command line tool to convert .trc and .xel files to the SQLite format understood by SqlWorkload.

All you have to do is run ConvertWorkload and pass the path to the source file and the path to the output file.
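A sketch of the invocation follows; the -I and -O switch names are my assumption here, so double check the exact syntax against the documentation on the GitHub wiki:

ConvertWorkload.exe -I "C:\temp\mytrace.trc" -O "C:\temp\SqlWorkload.sqlite"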

Conclusions

Again, it couldn’t get easier than that: SqlWorkload provides a Consumer to capture your workload to a file for later use, be it a replay or an analysis. All you have to do is fill in some information in a .json file and SqlWorkload takes care of the rest for you.

In the next post we will see how to replay a workload using WorkloadTools, and how one of the possible sources for the replay can in fact be one of these SQLite files produced by the File Writer Consumer.

Stay tuned!

Posted on March 12, 2019, in SQL Server.

Comments

  1. Great solution!!! I already tested it, replicating a workload from one server to another. Only one doubt, Spaghettidba: I’m trying to use your ConvertWorkload app but I can’t find it in the WorkloadTools directory. I downloaded it from the GitHub repo. Thanks in advance.

  2. I wanted to capture multiple databases on a single server at the same time, so I created a .bat file that runs three workload sessions simultaneously. However, when I start the .bat file, it closes the other two sessions. Did I miss something? I have attached the .bat file, workload scripts and log file.

    .bat file

    cd C:\Program Files\WorkloadTools\
    start cmd /k SqlWorkload.exe --File "Sampleondemand.json"
    start cmd /k SqlWorkload.exe --File "Sampleondemandone.json"
    start cmd /k SqlWorkload.exe --File "Sampleondemandtwo.json"

    I have created three .json files. A sample capture .json file is as follows.
    {
        "Controller": {

            "Listener":
            {
                "__type": "ExtendedEventsWorkloadListener",
                "ConnectionInfo":
                {
                    "ServerName": "testdb",
                    "UserName": "testlogin",
                    "Password": "password"
                },
                "DatabaseFilter": "DummyDB"
            },

            "Consumers":
            [
                {
                    "__type": "AnalysisConsumer",
                    "ConnectionInfo":
                    {
                        "ServerName": "AnalysisServer",
                        "DatabaseName": "Analysisdb",
                        "SchemaName": "Prod",
                        "UserName": "testlogin",
                        "Password": "password"
                    },
                    "UploadIntervalSeconds": 60
                },
                {
                    "__type": "WorkloadFileWriterConsumer",
                    "OutputFile": "C:\\temp\\SqlWorkload.sqlite"
                }
            ]
        }
    }

    log
    The event session has already been started.
    Error - WorkloadTools.WorkloadController : at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
    at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
    at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
    at System.Data.SqlClient.SqlCommand.RunExecuteNonQueryTds(String methodName, Boolean async, Int32 timeout, Boolean asyncWrite)
    at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
    at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
    at WorkloadTools.Listener.ExtendedEvents.ExtendedEventsWorkloadListener.Initialize() in C:\GitHub\WorkloadTools\WorkloadTools\Listener\ExtendedEvents\ExtendedEventsWorkloadListener.cs:line 131
    at WorkloadTools.WorkloadController.Run() in C:\GitHub\WorkloadTools\WorkloadTools\WorkloadController.cs:line 42
    Info - SqlWorkload.Program : Controller stopped.
    Info - WorkloadTools.Consumer.WorkloadFile.WorkloadFileWriterConsumer : Closing the connection to the output file
    Info - WorkloadTools.Listener.ExtendedEvents.ExtendedEventsWorkloadListener : Disposing ExtendedEventsWorkloadListener.
    Warn - WorkloadTools.Listener.ExtendedEvents.ExtendedEventsWorkloadListener : Error disposing ExtendedEventWorkloadListener: Object reference not set to an instance of an object.
    Info - WorkloadTools.Listener.ExtendedEvents.ExtendedEventsWorkloadListener : Extended Events session [sqlworkload] stopped successfully.
    Info - SqlWorkload.Program : Controller disposed.

    • You can’t have multiple instances of sqlworkload running against the same server. You can capture multiple databases by passing in a list in the database filter attribute: ["database1", "database2", "database3"], as in the sketch below.
      Hope this helps
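      As a sketch, the Listener section with the array filter would look like this (the server name is just a placeholder):

      "Listener":
      {
          "__type": "ExtendedEventsWorkloadListener",
          "ConnectionInfo": { "ServerName": "(local)" },
          "DatabaseFilter": ["database1", "database2", "database3"]
      }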

      • So, as you said in your previous reply, once I capture multiple databases with the filter attribute as ["database1", "database2", "database3"], is it the same procedure for the replay
        to apply the captured changes to multiple databases, like this:
        {
            "Controller": {

                "Listener":
                {
                    "__type": "FileWorkloadListener",
                    "Source": "C:\\temp\\SqlWorkload.sqlite",
                    "SynchronizationMode": "true"
                },

                "Consumers":
                [
                    {
                        "__type": "ReplayConsumer",
                        "ConnectionInfo":
                        {
                            "ServerName": "targetserver",
                            "DatabaseName": "database1", "database2", "database3",
                            "UserName": "username",
                            "Password": "password"
                        }
                    }
                ]
            }
        }

      • You can capture using a filter like I showed, with multiple databases. For the replay, each statement is executed against the same database where it was executed originally, unless you decide to change the target database as shown here: https://spaghettidba.com/2020/03/31/replaying-workloads-to-a-different-database/

  3. If I want to capture the entire server, do I need to list all the databases hosted on that server, or is there a better alternative to put in just the server name and capture all databases?

  4. Hi Spaghettidba, I came across your tool and started testing it, but I am a little bit confused. I started SqlWorkload without a DB filter; it created an Extended Events session called sqlworkload and I ran it for more than 4 hours, but nothing was written to the file. This XEvents session was deleted when I stopped the capture.
    Also, I have other XEvents sessions running; I thought it would capture all events in these XEvents sessions.
    So maybe I am misunderstanding how the tool works. I understood that once it is pointed to a DB server, it will capture all events from Extended Events.
    Should we always create the XEvents session when we run the tool, or can we just point it to an existing XEvents session?
    I saw in the documentation at https://github.com/spaghettidba/WorkloadTools/wiki/SqlWorkload, under ExtendedEventsWorkloadListener, the following:

    Available properties: [string SessionName = sqlworkload], but I was not really able to understand how to use this property.
    Thanks for your help

    • Hi Salam, I had a look at your json file and it looks like you are trying to record a workload to a sqlite file. The source SQL Server instance is SQL2K12 and the output file is c:\temp\SqlWorkload.sqlite.
      Your file looks OK, so you should be seeing some messages like "Writing event data to c:\temp\SqlWorkload.sqlite". Do you see this?
      You should also see messages like "1000 events saved", "2000 events saved" and so on. Do you see these?

      • No, nothing: I don’t see "Writing event data" nor "1000 events saved", and the file size was always 0. One note: the SQL Server name is sql2k12 but it is a SQL Server 2016 instance. Again, I will repeat the second question: what is the idea behind the tool? Should we always prepare a script to create the XEvents session, or can we just point it to an existing XEvents session? Thanks for your help

      • >> Should we always prepare a script to create the XEvents session or can we point it to an existing XEvents session?

        You don’t need to prepare a session and you don’t need to create a script: the tool takes care of it all. I suggest that you read the documentation and maybe watch this video on YouTube, which describes the whole idea of the tool and how to use it: https://www.youtube.com/watch?v=yqg2Y00IWx0

      • OK, I will watch the video, but please tell me what the issue could be: I ran it with the simple .json file and nothing happened.

      • There seems to be nothing wrong with your file. I replaced the name of your server with one I have here and used it to test: it works. This is the output I get:

        C:\Users\gsartori>"%programfiles%\workloadtools\sqlworkload.exe" --File "C:\Users\gsartori\AppData\Local\Temp\eM Client temporary files\fxuaypqv\test.json"
        Info - SqlWorkload.Program : SqlWorkload, Version=1.6.2.0, Culture=neutral, PublicKeyToken=null 1.6.2
        Info - SqlWorkload.Program : Reading configuration from 'C:\Users\gsartori\AppData\Local\Temp\eM Client temporary files\fxuaypqv\test.json'
        Info - WorkloadTools.Listener.ExtendedEvents.ExtendedEventsWorkloadListener : Reading Extended Events session definition from C:\Program Files\workloadtools\Listener\ExtendedEvents\sqlworkload.sql
        Info - WorkloadTools.WorkloadController : Listener of type ExtendedEventsWorkloadListener initialized correctly. Waiting for events.
        Info - WorkloadTools.Consumer.WorkloadFile.WorkloadFileWriterConsumer : Writing event data to C:\temp\SqlWorkload.sqlite
        Info - WorkloadTools.Consumer.WorkloadFile.WorkloadFileWriterConsumer : 1000 events saved
        Info - WorkloadTools.Consumer.WorkloadFile.WorkloadFileWriterConsumer : 2000 events saved

      • Hi, the video at https://www.youtube.com/watch?v=yqg2Y00IWx0 was very useful; now I can get the events. I updated the files in DebuggingTools so that I capture events on a SQL Server 2016 benchmark table, populated by the script generate-allWorkload.bat, and consume them with benchmark_analysis on a SQL Server 2019 instance. Running the viewer I can see all metrics, and it is fantastic and useful; I believe it should be the baseline. However, when running again, everything in benchmark_analysis is deleted, which means the baseline disappears and I can’t compare the new run with the first run. And even if the records were not deleted in benchmark_analysis, it seems to me the viewer would not know how to compare the two runs. Am I right or wrong? Thanks for your help

      • To capture a second workload, you need to change the destination schema in the .json file, so that you end up with two separate sets of tables (see the sketch below). Once you have your data populated, you can run the workload viewer to see the contents and compare the two workloads. Makes sense?
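        As a sketch, only the SchemaName in the AnalysisConsumer section needs to change between runs (the schema names here are placeholders):

        {
            "__type": "AnalysisConsumer",
            "ConnectionInfo":
            {
                "ServerName": "AnalysisServer",
                "DatabaseName": "Analysisdb",
                "SchemaName": "baseline" // e.g. "run2" for the second capture
            },
            "UploadIntervalSeconds": 60
        }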

  5. I forgot to mention that I downloaded the git repo and compiled SqlWorkload myself.

    • Hi Salam, what was your json file? Please email me if you find it more convenient. My email address is spaghettidba at sqlconsulting dot it.

    • Yes, it makes sense and I was able to compare 2 sessions: it works like a charm and we can see that you have put a lot of effort into this tool. Still, I need your clarification regarding my question about the DB filter, which I removed from the capture .json file; in the viewer I still see only the benchmark DB and not all DBs. Also, an idea: I developed a monitoring tool for BizTalk Server 12 years ago (you can check it at my blog https://salam.hd.free.fr/BlogEngine/post/2010/01/09/Biztalk-Dashboard). In this tool I register a session ID in a single table, where I can compare 2 or more sessions using session IDs. Can this be done easily in your tool?

  6. Also, when I did the second run, I removed the filter "DatabaseFilter": "Benchmark", but still in the viewer I see only Benchmark.

