Summary
I am after some advice on the easiest way to analyze simple data using SQL Server and .NET.
Details
Really simple data - I just need a really simple way to analyze it (with my simple brain).
I have a SQL Server table:
- PKID (Int)
- ApplicationName (VarChar)
- MethodName (VarChar)
- TimeInMs (Integer)
- AdditionalInfo (VarChar)
- DateTime (DateTime)
This table records the length of time it took for various methods to run in various applications. It could potentially have tens of thousands of rows. I would like to easily extract useful info from it (some of it in real time), but I am not sure of the best way to go about this. The kind of data I would like is:
Data:
- Average length of time for a method call
- Top ten slowest method calls
- Top ten fastest method calls
For the periods of:
- Last minute, hour, day, week, month
- Each day for the last 7 days, each week for the last 10 weeks
For the applications:
- All
- Each individually
-
Without timestamp information you will not be able to do meaningful analysis. At best you can write queries that produce summary statistics on the applications' performance.
select count(*) from table_name where ApplicationName = 'BAR.EXE';
select ApplicationName, sum(TimeInMs) from table_name group by ApplicationName;
Other than writing code to divide those numbers you cannot do very much.
Update: With timestamp information you can adjust the where clause of the above samples to select the ranges you are interested in. Given the inexact nature of your question I might suggest importing the data into Excel (I don't have Excel installed) and massaging the data in various ways rather than directly messing with SQL.
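As a rough sketch of that adjustment, assuming the DateTime column from the question and the placeholder table_name used in the samples above, the count and sum could be combined and restricted to the last hour like this. AVG does the division, so no extra code is needed for the average:

```sql
-- Sketch only: table_name is the placeholder from the samples above.
SELECT ApplicationName,
       COUNT(*)      AS Calls,
       SUM(TimeInMs) AS TotalMs,
       AVG(TimeInMs) AS AvgMs   -- integer result, because TimeInMs is an integer column
FROM table_name
WHERE [DateTime] >= DATEADD(hour, -1, GETDATE())   -- change the unit and offset for other ranges
GROUP BY ApplicationName;
```

Swapping `hour, -1` for `day, -7` or `minute, -1` gives the other ranges mentioned in the question.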
James : There is a DateTime column (it was missed off my example). It would be easy to add a timestamp column too if this proves useful. I have added them to my post above for clarity.
ojblass : Do you have an easy way to run SQL against the database?
-
I think ojblass was referring to the DateTime field you omitted from your question.
The actual timestamp datatype in MS SQL Server is misleading in name. It has nothing to do with dates and times. It is a binary "version number". It is used mostly to deal with concurrency issues when updating a row in the table but would be useless for any analysis tasks.
I would suggest improving your column names a bit. Calling a column "DateTime" is confusing and could cause you some trouble in writing queries if you aren't careful.
Anyway... the queries you are looking for range from simple to quite complex if written directly in TSQL.
Here are some examples (I have not syntax checked these, so they are "approximate" at best):
Average time for a specific application
-- [Table] is a placeholder for your table name; the brackets are needed because TABLE is a reserved word
select avg(TimeInMs) as AvgTime from [Table] where ApplicationName = @ApplicationName
Average time for a specific application during the last 1 minute
select avg(TimeInMs) as AvgTime from [Table] where ApplicationName = @ApplicationName and [DateTime] >= DATEADD(minute, -1, getdate())
You'll end up wanting to write stored procedures for most of these. Some of the queries you talk about will require some grouping and such too... I recommend you get a book on TSQL if you go this route.
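To give a feel for the grouping involved, here is a minimal, untested sketch of two of the requested reports. It assumes the columns from the question and uses [Table] as a placeholder table name; the variable length and the value 'MyApplication' are made up for illustration:

```sql
-- Sketch only: [Table] is a placeholder (bracketed because TABLE is a reserved word).
DECLARE @ApplicationName varchar(100);      -- length is an assumption
SET @ApplicationName = 'MyApplication';     -- hypothetical application name

-- Ten slowest individual calls for one application in the last day.
SELECT TOP (10) MethodName, TimeInMs, [DateTime]
FROM [Table]
WHERE ApplicationName = @ApplicationName
  AND [DateTime] >= DATEADD(day, -1, GETDATE())
ORDER BY TimeInMs DESC;                     -- use ASC for the ten fastest

-- Average time per method for each day in a rolling 7-day window.
-- DATEADD/DATEDIFF against day 0 strips the time portion of the value.
SELECT MethodName,
       DATEADD(day, DATEDIFF(day, 0, [DateTime]), 0) AS CallDate,
       AVG(TimeInMs) AS AvgTime,
       COUNT(*)      AS Calls
FROM [Table]
WHERE ApplicationName = @ApplicationName
  AND [DateTime] >= DATEADD(day, -7, GETDATE())
GROUP BY MethodName, DATEADD(day, DATEDIFF(day, 0, [DateTime]), 0)
ORDER BY CallDate, MethodName;
```

Because the window is a rolling 7 x 24 hours, the earliest date in the second result will usually cover only part of a day. Dropping the ApplicationName filter gives the "all applications" variant.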
If you are doing this with LINQ to SQL within your application, it isn't much different, but in general LINQ is easier to write (debatable, of course).
Here are the same two queries using LINQ to SQL in C# (again, I haven't tested these, so there could be minor syntax mistakes).
var ctx = new MyDataContext();
// Average time for a specific application
var avgAll = (from item in ctx.Table
              where item.ApplicationName == "MyApplication"
              select item.TimeInMs).Average();
// Average time for a specific application during the last 1 minute
var cutoff = DateTime.Now.AddMinutes(-1);
var avgLastMinute = (from item in ctx.Table
                     where item.ApplicationName == "MyApplication"
                        && item.DateTime >= cutoff
                     select item.TimeInMs).Average();
How you do the analysis depends on what technologies you are using and what you are doing with the results.
Update: In answer to follow-up question from comments:
I can't think of a good way to handle it by storing the desired time intervals in another table that doesn't get massively difficult (cursors and dynamically constructed TSQL via the EXECUTE command).
A simpler query that gets the results you want might look like this in TSQL (I'm not advocating that this is the "best" way, but it works and is pretty fast).
select avg(TimeInMs) as AvgTime, 'Last 1 minute' as TimePeriod
from [Table]
where ApplicationName = @ApplicationName and [DateTime] >= DATEADD(minute, -1, getdate())
union
select avg(TimeInMs) as AvgTime, 'Last 2 minutes' as TimePeriod
from [Table]
where ApplicationName = @ApplicationName and [DateTime] >= DATEADD(minute, -2, getdate())
-- repeat the union as many times as needed for each time period
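For a long list of periods, one possible alternative (not part of the answer as originally posted, and untested) is to join against an inline list of minute counts instead of repeating the union. This sketch assumes SQL Server 2008 or later for the VALUES table constructor; the alias p, the column name Minutes, and the value 'MyApplication' are made up for illustration, and [Table] is again a placeholder:

```sql
-- Sketch only: one result row per period in the inline list.
DECLARE @ApplicationName varchar(100);      -- length is an assumption
SET @ApplicationName = 'MyApplication';     -- hypothetical application name

SELECT p.Minutes,
       AVG(t.TimeInMs) AS AvgTime           -- NULL when no calls fall inside the period
FROM (VALUES (1), (2), (5), (10), (60)) AS p(Minutes)
LEFT JOIN [Table] AS t
       ON t.ApplicationName = @ApplicationName
      AND t.[DateTime] >= DATEADD(minute, -p.Minutes, GETDATE())
GROUP BY p.Minutes
ORDER BY p.Minutes;
```

The left join keeps a row for every period even when nothing was logged in it. A real table with a Minutes column could take the place of the inline list if the periods need to be editable.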
James : But say I wanted to get the average time for a method for the following numbers of minutes: 1,2,3,4,5,6,7,8,9,10,20,30,40,50,60,120,240,360. Is there a way to define the list of minutes in another table, then do some kind of join to get the results for all of these? I can see how to get the simple single pieces of info back now (and thanks for your input and LINQ examples). I am really after the best way to get the "multiple" pieces of data back at once (like the minutes example). I am surprised that for such a simple-seeming dataset, it is so hard!
Stephen M. Redd : I amended my answer to include one example of what you are asking about. The problem here is that the queries you want are only simple for the human mind, but are actually reasonably complex problems from a SQL point of view. There are ways in TSQL to write very complex selects like the ones you are asking about more elegantly, but such an answer is far outside the scope of what can reasonably be answered in a simple SO post. I'd recommend some serious training or reading on advanced TSQL if you'll be doing much of this kind of stuff in the future.
James : Many thanks - that gives me a good starting point.