root cause analysis

Speedscale was built primarily to give engineering teams better insight into their applications over time, so that developers and SREs can be confident that tomorrow’s application code will work just as well in production as yesterday’s did. It can also replay single transactions for root cause analysis.

SREs typically use the web console to analyze production traffic and pick out snapshots, sections of traffic selected by time or other filters, which are replayed as new code is built. For developers, the interaction is therefore much the same as with any other automated build step. If Speedscale detects that the new version of the application would cause errors, high latency, or any other bad customer experience, the pipeline step fails. The developer then revisits the code and makes the necessary changes. Crisis averted. This is the primary use case for Speedscale today, and it stops many bugs from reaching customers.

Unfortunately, nothing is foolproof, and some bugs will inevitably make it through. When they do, it can be difficult or impossible to recreate the exact circumstances a customer faced when they hit an issue. The context is lost, and developers must do what they can with limited information from customer reports and logs. But with Speedscale, we can do better.

Let’s look at the flow most development teams follow today to reproduce such a bug:

  1. Find the logged request or webpage where a particular issue has occurred
  2. Copy/paste request data from logs or the browser
  3. Create a curl command from the available data. Starting from logs, the command has to be written from scratch; a curl command copied from a sample webpage usually needs tweaking to bring it as close as possible to the production request.
  4. Tweak the curl command to point to localhost
  5. Change any other details that might have been different for the customer
  6. Make the request and find the issue or go back to the start with another request
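To make the manual effort concrete, here is a minimal Python sketch of steps 2 through 4: it takes one request copied out of a log and turns it into a curl command aimed at a local server. The log fields, the example.com URL, and the to_local_curl helper are all hypothetical and only illustrate the kind of glue code teams end up writing by hand.

```python
import shlex
from urllib.parse import urlsplit, urlunsplit


def to_local_curl(method, url, headers, body=None, local_base="http://localhost:8080"):
    """Build a curl command that sends one logged request to a local server."""
    original = urlsplit(url)
    local = urlsplit(local_base)
    # Keep the production path and query string, swap in the local scheme and host (step 4).
    target = urlunsplit((local.scheme, local.netloc, original.path, original.query, ""))
    parts = ["curl", "-X", method, target]
    for name, value in headers.items():
        # A production Host header is usually wrong locally; drop it and let curl set one.
        if name.lower() == "host":
            continue
        parts += ["-H", f"{name}: {value}"]
    if body is not None:
        parts += ["--data-raw", body]
    # Quote every argument so the command can be pasted into a shell as-is.
    return " ".join(shlex.quote(p) for p in parts)


# Hypothetical request copied by hand from a log line (steps 1 and 2).
logged = {
    "method": "POST",
    "url": "https://api.example.com/v1/orders?dry_run=false",
    "headers": {"Host": "api.example.com", "Content-Type": "application/json"},
    "body": '{"sku": "A-100", "qty": 2}',
}

print(to_local_curl(**logged))
```

Even with a helper like this, steps 5 and 6 remain: anything the log did not capture, such as cookies, auth tokens, or the calls that preceded the failing one, still has to be guessed.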

But if we’re trying to recreate production conditions, why not just use production data?

In addition to the web portal, Speedscale maintains an engineer-focused command line tool, speedctl. Using the speedctl act command, we can replay specific production requests locally, cutting out most of the copy/paste and all of the guesswork. With this tool, a developer can run a command like speedctl act -s 007f4ccc-2fad-48ff-9182-8bc8465360a4 -u http://localhost:8080 and watch the production traffic stream into their locally running application. This gives time back to developers by essentially putting their local code into the production environment’s context.
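For teams that want to script this step, a thin Python wrapper might look like the sketch below. It uses only the act subcommand and the -s and -u flags exactly as they appear in the command above; reading -s as the ID of the traffic to replay and -u as the target URL is an assumption drawn from that example, and the build_act_command and replay helpers are hypothetical names, not part of speedctl.

```python
import shutil
import subprocess


def build_act_command(replay_id, target_url):
    """Assemble the speedctl invocation shown in the text as an argument list."""
    # Assumption: -s takes the ID of the traffic to replay, -u the URL to send it to.
    return ["speedctl", "act", "-s", replay_id, "-u", target_url]


def replay(replay_id, target_url):
    """Run the replay if speedctl is installed; otherwise just report the command."""
    cmd = build_act_command(replay_id, target_url)
    if shutil.which("speedctl") is None:
        print("speedctl not found on PATH; would run:", " ".join(cmd))
        return None
    # check=False: leave it to the caller to decide what a non-zero exit code means.
    return subprocess.run(cmd, check=False).returncode


# Print the command for the example ID from the text without executing anything.
print(" ".join(build_act_command(
    "007f4ccc-2fad-48ff-9182-8bc8465360a4",
    "http://localhost:8080",
)))
```

Calling replay with an ID and a local URL then fits into a debugging script or an editor task, so the same production request can be re-sent after each code change.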

 

Speedscale helps you generate traffic scenarios and automate scalable testing so you maximize developer hours and slim down processes, all while preventing production incidents. Why not schedule a demo today?
