Deconstructing Perf’s Data File

It is no mystery that Perf is like a giant organism written in C with an infinitely complex design. Of course, there is no such thing. Complexity is just a state of mind they would say and yes, it starts fading away as soon as you get enlightened. So, one fine day, I woke up and decided to understand how the perf.data file works because I wanted to extract the Intel PT binary data from it. I approached Francis and we started off on an amazing adventure (which is still underway). If you are of the impatient kind, here is the code.

A Gentle Intro to Perf

I would not delve deep into Perf right now. However, the basics are simple to grasp. It is like a Swiss army knife which contains tools to understand your system from either a very coarse to a quite fine granularity level. It can go all the way from profiling, static/dynamic tracing to custom analyses build up on hardware performance counters. With custom scripts, you can generate call-stacks, Flame Graphs and what not! Many tracing tools such as LTTng also support adding perf contexts to their own traces. My personal experience with Perf has usually been just to profile small piece of code. Sometimes I use its annotate feature to look at the disassembly to see instruction profiling right from my terminal. Occasionally, I use it to get immediate stats on system events such as syscalls etc. Its fantastic support with the Linux kernel owing to the fact that it is tightly bound to each release, means that you can always have reliable information. Brendan Gregg has written so much about it as part of his awesome Linux performance tools posts. He has some some actual useful stuff you can do with Perf. My posts here just talks about some of its internals. So, if Perf was a dinosaur, I am just talking about its toe in this post.

Perf contains a kernel part and a userspace part. The userspace part of Perf is located in the kernel directory tools/perf. The perf command that we use is compiled here. It reads kernel data from the Perf buffer based on the events you select for recording. For a list of all events you can use, do perf list or sudo perf list. The data from the Perf’s buffer is then written to the perf.data file. For hardware traces such as in Intel PT, the extra data is written in auxiliary buffers and saved to the data file. So to get your own custom stuff out from Perf, just read its data file. There are multiple ways like using scripts too, but reading a binary directly allows for a better learning experience. But the perf.data is like a magical output file that contains a plethora of information based on what events you selected, how the perf record command was configured. With hardware trace enabled, it can generate a 200MB+ file in 3-4 seconds (yes, seriously!). We need to first know how it is organized and how the binary is written.

Dissection Begins

Rather than going deep and trying to understand scripted ways to decipher this, we went all in and opened the file with a hex editor. The goal here was to learn how the Intel PT data can be extracted from the AUX buffers that Perf used and wrote in the perf.data file. By no means is this the only correct way to do this. There are more elegant solutions I think, esp. if you see some kernel documentation and the uapi perf_event.h file or see these scripts for custom analysis. Even then, this can surely be a good example to tinker around more with Perf. Here is the workflow:

  1. Open the file as hex. I use either Vim with :%!xxd command or Bless. This will come in handly later.
  2. Use perf report -D to keep track of how Perf is decoding and visualizing events in the data file in hex format.
  3. Open the above command with GDB along with the whole Perf source code. It is in the tools/perf directory in kernel source code.

If you setup your IDE to debug, you would also have imported the Perf source code. Now, we just start moving incrementally – looking at the bytes in the hex editor and correlating them with the magic perf report is doing in the debugger. You’ll see lots of bytes like these :

Screenshot from 2016-06-16 19-01-42

A cursory looks tells us that the file starts with a magic – PERFFILE2. Searching it in the source code eventually leads to the structure that defines the file header:

struct perf_file_header {
   u64 magic;
   u64 size;
   u64 attr_size;
   struct perf_file_section attrs;
   struct perf_file_section data;
   /* event_types is ignored */
   struct perf_file_section event_types;
   DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
};

So we start by mmaping the whole file to buf and just typecasting it to this. The header->data element is an interesting thing. It contains an offset and size as part of perf_file_section struct. We observe, that the offset is near the start of some strings – probably some event information? Hmm.. so lets try to typecast this offset position in the mmap buffer (pos + buf) to perf_event_header struct :

struct perf_event_header {
   __u32 type;
   __u16 misc;
   __u16 size;
};

For starters, lets further print this h->type and see what the first event is. With our perf.data file, the perf report -D command as a reference tells us that it may be the event type 70 (0x46) with 136 (0x88) bytes of data in it. Well, the hex says its the same thing at (buf + pos) offset. This in interesting! Probably we just found our event. Lets just iterate over the whole buffer while adding the h->size. We will print the event types as well.

while (pos < file.size()) {
    struct perf_event_header *h = (struct perf_event_header *) (buf + pos);
    qDebug() << "Event Type" << h->type;
    qDebug() << "Event Size" << h->size;
    pos += h->size;
}

Nice! We have so many events. Who knew? Perhaps the data file is not a mystery anymore. What are these event types though? The perf_event.h file has a big enum with event types and some very useful documentation. Some more mucking around leads us to the following enum :

enum perf_user_event_type { /* above any possible kernel type */
    PERF_RECORD_USER_TYPE_START = 64,
    PERF_RECORD_HEADER_ATTR = 64, 
    PERF_RECORD_HEADER_EVENT_TYPE = 65, /* depreceated */
    PERF_RECORD_HEADER_TRACING_DATA = 66,
    PERF_RECORD_HEADER_BUILD_ID = 67,
    PERF_RECORD_FINISHED_ROUND = 68,
    PERF_RECORD_ID_INDEX = 69,
    PERF_RECORD_AUXTRACE_INFO = 70,
    PERF_RECORD_AUXTRACE = 71,
    PERF_RECORD_AUXTRACE_ERROR = 72,
    PERF_RECORD_HEADER_MAX
};

So event 70 was PERF_RECORD_AUXTRACE_INFO. Well, the Intel PT folks indicate in the documentation that they store the hardware trace data in an AUX buffer. And perf report -D also shows event 71 with some decoded PT data. Perhaps, that is what we want. A little more fun with GDB on perf tells us that while iterating perf itself uses the union perf_event from event.h which contains an auxtrace_event struct as well.

struct auxtrace_event {
    struct perf_event_header header;
    u64 size;
    u64 offset;
    u64 reference;
    u32 idx;
    u32 tid;
    u32 cpu;
    u32 reserved__; /* For alignment */
};

So, this is how they lay out the events in the file. Interesting. Well, it seems we can just look for event type 71 and then typecast it to this struct. Then extract the size amount of bytes from this and move on. Intel PT documentation further says that the aux buffer was per-CPU so we may need to extract separate files for each CPU based on the cpu field in the struct. We do just that and get our extracted bytes as raw PT packets which the CPUs generated when the intel_pt event was used with Perf.

A Working Example

The above exercise was surprisingly easy once we figured out stuff so we just did a small prototype for our lab’s research purposes.  There are lot of things we learnt. For example, the actual bytes for the header (containing event stats etc. – usually the thing that Perf prints on top if you do perf report --header) are actually written at the end of the Perf’s data file. How endianness of file is determined by magic. Just before the header in the end, there are some bytes which I still have not figured out (near event 68) how they can be handled. Perhaps it is too easy, and I just don’t know the big picture yet. We just assume there are no more events if the event size is 0 😉 Works for now. A more convenient way that this charade is to use scripts such as this for doing custom analyses. But I guess it is less fun that going all l33t on the data file.

I’ll try to get some more events out along with the Intel PT data and see what all stuff is hidden inside. Also, Perf is quite tightly bound to the kernel for various reasons. Custom userspace APIs may not always be the safest solution. There is no guarantee that analyzing binary from newer versions of Perf would always work with the approach of our experimental tool. I’ll keep you folks posted as I discover more about Perf internals.

Advertisements

Qt Apps on Android! Part Two : An App(le) a day

No guys, this post is not related to Apple Inc or Steve Jobs but to my previous post 🙂 We now are in a position to have our development setup ready for Qt app development on Android so lets begin with the actual stuff. I shall take an example of the digital clock app you had seen in the previous post (reproduced here for your sake).

For some Qt newbies, its also going to be a tutorial on using Qt Creator effectively. We shall cover UI design and then do some coloring and stuff like that to make it more beautiful. Then we shall code the app so that your clock works.

Requirements

For your reference, I have put up this simple app on my git repo or maybe you can get the tarball from here

Step 1

Start the Necessitas Qt Creator and create a new Qt Gui Application from File > New File or Project > Qt Widget Project > Qt GUI Application

Step 2

Choose the project name and location and after that choose the Qt version as  Qt for Android which we created in the last post.

You can also select the Desktop version to for prototyping your app for the desktop x86 host. Once the project is created you can see the auto generated files under Project as shown below. The file tuxologycloxk.cpp is the one in which all the logic goes.

Step 3

Under Forms, click the .ui file and start making the UI. Its a pretty easy job actually, you have to drag and drop the required widgets and arrange them properly in something called as layouts. Just analyse a bit how I have created the UI for the clock.

You can drag and drop the Widgets from the left panel to the form view and the corresponding Objects will be created in the right top panel as shown above. The property for each project can be set directly from here only. For eg. the initial value (initValue) for the lcdNumber object has been set as 1200 above. You can actually set the widgets background as well as the whole application colour palette by changing properties of the respective objects.

Notes on StyleSheets

You can also apply styleSheets to make your app a bit beautiful too. For example the Exit text that you see in the application is actually a button with some styles applied. You can set styles using the UI editor quite easily. Just right click the corresponding widget and click on Change styleSheet. You will get window as shown below in which you can apply your desired style.

The above stylesheet changes a button from the boring button widget to a sleek black button which mixes well with the application’s look and feel.

Step 4

Look closely and you will understand that writing code is no big deal too. Just refer my digital clock source and browse through the code to understand it. Just a small reminder on creating signals and slots – You can click on the widget directly to create slots for the specific signals they will emit. For eg. right click Exit and select Go to slot.. A dialog box will ask you the signal which will be emitted and when you hit Ok a slot in the code will  be automatically generated. Now you can write whatever code you want to implement in that slot.

Step 5

Assuming that you have created the application, you can do some other settings too. Just click on Projects on the left panel and you will see different targets for you application. We had opted for Android as well as Desktop in the beginning so both shall be shown here. Click on Run and then Details under Package Configuration. You will see some configuration tabs as shown below. You can fine tune some stuff from here of course such as Android Permissions, app name, app icon etc.

You also have an option to either deploy local Qt libs for the device or use device’s libs. If you have installed Ministro from Android market to your device, just leave it to use the devices qt libs. However, if you are going to use the emulator, make sure you get the Ministro apk from here and install it on the emulator by selecting the third option below.

Once all is set its time to connect your device, set platform as Android and hit Run (Ctrl+R) You can see the compile output on the compile output window and the debug messages in Application Output window (hit Alt+3 or Alt+4 to switch) Watch out for any build issues too. I hope the same stuff works fine with an AVD too as I haven’t tried that out actually. I do all my testing on my rooted Sony Xperia mini x10 Pro and the first image in this post is what you should get if you try to build the TuxologyClock project for your device.

Thats all Folks! Happy hacking!

Qt Apps on Android! Part One : <3 is in the Air :)

Have you loved two tangentially apart technologies at the same time? Its like holding one girl’s hand while you woo another one 😉 Yeah something like that is the case with me. To my girl – “Its OK sweety, I’m just talking about Qt and Android :)”

There must have been a time when you would have thought, “Oh God, I wish I could just port all these apps I run on my desktop to my new android phone.” Or maybe you are one of hose who say, “I wish I could use my Android cell to prototype my new Qt based embedded device that I am making. It’d be something cool to show to those black shoes, red tie morons in the conference room.”

The Necessitas project comes to your aid guys. I shall be writing a short tutorial series on creating small Qt app like these :

in the speediest of ways and port it to your device. This part will consist of setting up the tools necessary for Qt application development on Android

Necessitas

Also known as Android Lighthouse project, this is the individually developed port of Qt for Android. Necessitas comes with a modified Qt Creator IDE for building, deploying and even debugging your applications directly for your Android device. You will be amazed to see the ease with which you can develop and debug your apps. Say thanks to BogDan Vatra and those unsung heroes who have brought this to you. Now lets begin.

Get Necessitas SDK

Get the Necessitas 0.3 online installer from here. I however downloaded the 0.1.1 version available as an offline install which serves the purpose well. Its available in old versions directory. The installation is pretty straight forward. Just run the installer and make sure that you install the SDK in /opt/necessitas. You may have to make your /opt 777 for sometime and then revert back to 755 once the installation is over.  The SDK mainly consists of the cross compiler for android on ARM and lots of cross compiled ARM libs for Qt. I have mentioned in previous posts how to do all that manually but here, its all ready for you 🙂 Once the installation is over, you will get a Necesitas Qt Creator in your applications. This is almost same as your traditional Qt Creator IDE. We shall move on to configure it now.

Configure Qt Creator

Requirements :

  • Install ant if required by yum install ant
  • Check whether you have JDK with java -version
  • Get Android SDK from here
  • Get Android NDK from here

Step 1

Extract the SDK and NDK at some locations and start Necessitas Qt Creator/Qt Creator for Android and go to Tools>Options. Click Qt4 tab and Add a new qmake path. Give this new qmake path from /opt/necessitas/Android/<qtversion>/bin/qmake This qmake will make the projects and makefiles cross-compile ready. Give some name to it – maybe Qt for Android

Step 2

Now that you have the new Qt setup, Click the Android tab on the left and specify the SDK and NDK target and set proper toolchain as shown below. Also set the ant location and hit Apply

If you are not having any Android device, then create a AVD to test your app. Lastly, some configuration is also required on your device.

Step 3

Now, we have almost everything ready for development on our device, however to run a Qt app we need libraries on the target device. For this, there are two options. Either while developing application, an option to use local Qt libs can be selected or a nifty tool called Ministro can be used. Ministro is an android application that can be downloaded from the Android market. This application performs a one time download of Qt libs from the net on the device as required by the application you have created. In a simple application mostly it will do a mostly 8Mb install of QtCore and QtGui modules.

The next post will describe how to create a small digital clock app (as shown above) using the Qt Creator, something about putting Style Sheets in Qt apps and then get it on your device! Keep experimenting.

Source : http://sourceforge.net/p/necessitas/home/necessitas/

Qt 4.6 on mini2440 – A Definitive Guide

Having Qt on your device is a awesome way to ensure that you and many other developers can enjoy developing applications and having fun in addition to doing rapid application development whenever required. The wealth of support provided by the Qt framework – right from OpenGL to WebKit tempts a embedded developer for sure. So I decided its time for a change. Lets dump the overkiiling Android and take shelter under Qt’s canvas for my device. I kept on roaming the intricately laid out web of information on having Qt 4.6.2 run successfully on my mini2440. But owing to the strange 12.1″ LCD that I have attached to it I was more-or-less stuck on the Touchscreen calibration part. So Now after hits and trials and many cups of tea, I am assembling a guide. I intend to make if definitive but only the willful enough to dare can tell.

What You Need

Qt 4.6.2 

GNU ARM Toolchain

Step 1

Setup the toolchain and modify the PATH variable accordingly. Untar qt-everywhere-opensource-src-4.6.2.tar.gz wherever you like. Mine is /usr/local/qt

Step 2

Replace the whole text in mkspecs/qws/linux-arm-g++/qmake.conf by the following:

#
# qmake configuration for building with arm-linux-g++
#

include(../../common/g++.conf)
include(../../common/linux.conf)
include(../../common/qws.conf)

# modifications to g++.conf
QMAKE_CC                = arm-none-linux-gnueabi-gcc -msoft-float -D_GCC_FLOAT_NOT_NEEDED -march=armv4t -mtune=arm920t -O0 -lts
QMAKE_CXX               = arm-none-linux-gnueabi-g++ -msoft-float -D_GCC_FLOAT_NOT_NEEDED -march=armv4t -mtune=arm920t -O0 -lts
QMAKE_LINK              = arm-none-linux-gnueabi-g++ -msoft-float -D_GCC_FLOAT_NOT_NEEDED -march=armv4t -mtune=arm920t -O0 -lts
QMAKE_LINK_SHLIB        = arm-none-linux-gnueabi-g++ -msoft-float -D_GCC_FLOAT_NOT_NEEDED -march=armv4t -mtune=arm920t -O0 -lts

# modifications to linux.conf
QMAKE_AR                = arm-none-linux-gnueabi-ar cqs
QMAKE_OBJCOPY           = arm-none-linux-gnueabi-objcopy
QMAKE_STRIP             = arm-none-linux-gnueabi-strip
QMAKE_RANLIB            = arm-none-linux-gnueabi-ranlib

load(qt_config)

Remember that the PATH variable should have the location of arm-none-linux-gnueabi-gcc etc.

Step 3

Now turn off compiler optimization by making changes in /mkspecs/common/g++.conf Just change the following line to this :

QMAKE_CFLAGS_RELEASE  += -O0

Step 4

Now we have to configure Qt the way we want it by doing ./configure but we need to specify some options according to our requirements.

./configure -embedded arm -xplatform qws/linux-arm-g++ -prefix \
/usr/local/qt -little-endian -webkit -no-qt3support -no-cups -no-largefile \
-optimized-qmake -no-openssl -nomake tools -qt-mouse-tslib -qt-kbd-linuxinput

Now, this configuration is what I needed, for example for my network enabled device I needed a webkit based application so I provided -webkit option. You can drop in or add whatever you want. Some handy options that you may (or may not require) are :

Enable touchscreen library support : -qt-mouse-tslib

Enable USB keyboard support : -qt-kbd-linuxinput

The above options affect the QtGui library so you need to replace only QtGui.so.x.x file on your root filesystem if you are planning to make changes.

Step 5

So more or less you are done. Its time to do some creative stuff (blah!) just do

make

Wait for a couple of hours or so and find your cross compiled freshly baked libraries in /usr/local/qt/lib/

Setting up Qt 4.6 libs on mini2440

So now, its time to put the libs on your mini machine! You may choose to create a separate filesystem using busybox too. Elaborate tutorials on that are available on the web. Just Google it up. What I will do is, modify the rootfs that came with my mini and remove all unnecessary stuff from it. Create a dir /usr/local/qt on your mini2440’s stock root filesystem and copy the complete lib directory from step 5 to that location using a USB stick/SD Card or something. Remember to do a df on the mini beforehand to see the free NAND you have got. In case the memory is low, delete some stock data – the small video file, mp3 file and some sample images that are present on it.Also remove unnecessary applications viz konqueror and old qt 2.2.0 libraries from the system.

Environment Variables

To make the mini understand the new Qt libs, we’ll add some variables

On your mini2440, edit /etc/inint.d/rcS and add the following line to it

source /etc/qt46profile

Now create the /etc/qt46profile file and the following text to it :

export LD_LIBRARY_PATH=/usr/local/qt/lib
export QTDIR=/usr/local/qt
export QWS_MOUSE_PROTO=IntelliMouse:/dev/input/event0
export QWS_DISPLAY=LinuxFB:mmWidth=310:mmHeight=190

Remember that these variables may change according to your qt libs location and mouse/touchscreen drivers. Go to /dev/input/ and see which file is responsible for which device. For eg. if tslib is configured on your device then the QWS_MOUSE_PROTO  variable will have the value something like

QWS_MOUSE_PROTO=Tslib:/dev/input/event1

So, you are done at last and have your system ready with Qt 4.6.2 libs. Try running a simple application (viz a Analog Clock from the cross-compiled qt examples at /usr/local/qt/examples/widgets/analogclock/analogclock by giving the -qws option on the mini2440 shell as

./analogclock -qws

Thats it! Do tell me if you are able to reach here. In the next two posts I shall be discussing about rapidly developing a GUI application for your mini using Qt Creator and compiling and configuring Tslib for mini2440.