FUDCon Pune 2015


This year’s FUDCon for the APAC region was once again held in Pune. Attending it reminded me of 2011 – the last time this event was in Pune. I had submitted some talks and sessions, as I still feel more like an APAC guy even though I changed zones some time ago. Hoping that there would be enough folks interested in what I have been working on for the last couple of years, I submitted a talk, “Kernel and Userspace tracing with LTTng and friends”. You can see the slides here. Of course, systems performance consumes most of my waking hours and I thought that it would benefit Fedora as well. I was happy when I saw that the talk was selected and there was an opportunity for me to share my experiences with others in Pune. Along with this talk, I was also going to run a Kernel module workshop and an AskFedora UX/UI hackfest that Sarup and I had decided to organize. I knew that my FUDCon would be packed 🙂

I arrived on the night of the 24th, jetlagged and tired from a long journey. I met Izhar and Somvannda at Mumbai and we all set out for Pune. To our surprise, Siddhesh and Kushal were waiting for our arrival at 3am in the hotel. Thanks guys for your seamless efforts in co-ordinating travel for speakers! (and of course a whole lot of other things you did). We quickly hit the sack. Most of the next day was spent doing some chores for FUDCon organization – packing the goodie bags with Ani and Danishka at Siddhesh’s house. We subsequently went to the Red Hat Pune office, where I met Jared, Shreyank, Prasad, Harish, Sinny et al.


Also, as you can see, Izhar was not afraid of some fizzy-drink fireworks in the RH office. Chillax. It was just a photo-op 🙂

Day 1

I had a very small selection of talks to attend. The day started with Harish’s keynote and then an Education panel discussion. I soon diverted to some other talks. I started with the kdump and kernel crash analysis workshop by Buland Singh and Gopal Tiwari. Their slides and explanation were good but unfortunately the demo failed. I moved on to Sinny’s presentation on ABI compatibility. This one was delivered quite well IMO. I wanted to attend Vaidik’s Vagrant talk but settled for Samikshan’s talk on his “spartakus” tool to detect kernel ABI breakages. It is based on the “sparse” tool. I went to the FUDCon APAC BoF next to see how planning was being done. I don’t remember exactly, but the day probably ended with a visit to a local microbrewery.

Day 2

I met Sankarshan after a long time. He was manning the Fedora booth like a soldier in the vanguard. I also saw the FUDCon T-shirts that I had designed. They looked quite well done, which of course made me happy. I picked up some FUDCoins (aka Fedora pin-badges). Legend (me) says that you can not buy worldly stuff but just pure emotions with such coins. I soon moved to the opening keynote by Jiri, which was nice – mostly because he told us that the mp3 patent was expiring soon and Fedora would possibly support mp3 out of the box. Next was my talk on tracing. Dunno how that went, but some folks met me at the end demanding a copy of Brendan’s performance tools cheat-sheet. Felt nice that people there cared about this 🙂 HasGeek folks tell me that the videos will be available soon. Until then, here are the slides. I continued to Pravin’s talk on Internationalization – quite nice – and then to an old friend Kiran’s talk on Wifi internals. This one was sufficiently detailed and quite informative. I then went on to deliver a workshop on Kernel module programming, where I basically started with a simple hello world module and ended with a small packet filter based on netfilter hooks. Some first year students from Amrita university looked very enthusiastic. They even met me and asked me how to begin kernel programming. I was impressed by how pumped up they were about kernel programming even in their first year!

Look who’s trying to bore people to death

This day ended with the customary FUDPub. We also stayed up late into the night talking about life, the universe and everything with Sinny and Charul – while watching a buzzed Sarup struggle to make coffee and tea for us as he intermittently poured in his inputs 🙂 This was somewhat like the famous pink slippers incident of FUDCon 2011.

Best. FUDPub. Ever.

I don’t think I can explain how awesome a FUDPub can be when you have awesome food, drinks and a whole bowling alley booked for the volunteers and speakers. It was truly awesome. We all agreed that this has set a threshold for all the future FUDPubs now!


Day 3

The last day was more about hackfests and some workshops, such as the Docker workshop by Aditya, Lalatendu and Shivprasad (which I did not attend, but have been told was really good). I did, however, attend a really good workshop on Inkscape by Sirko and then a small part of the Blender workshop by Ryan Lerch. It was nice seeing some folks pouring into Harish’s keysigning party with their Blender model renders, looking content with their dancing cube 🙂 I am sure Ryan did an awesome job of showing them the power of Blender! I was tired by this time and the attendance was thinning, but Sarup and I still managed the AskFedora hackfest. There were only a few folks, but we still managed to get some good feedback from participants Charul and Sinny on the UI done so far by our GSoC student Anuradha. I have to prepare feedback for her soon so that she can make changes. We ended the day with yet another long night of discussions with Siddhesh, Kushal, Charul, Sarup and Sinny.

In the end, I would say – it was an awesome event. The quality of talks was really good. I hope it benefited students and the industry folks that attended these. Also, Sarup is an all round awesome guy and a nice roommate. I will update this if I remember something more and if I manage to get some more photos from the event.

EDIT: Added photos. Venue and my talk photo shamelessly taken from Sinny’s photostream on Flickr.

Embedded, Kernel, Linux

BPF Internals – I

A recent post by Brendan Gregg inspired me to write my own blog post about my findings on how the Berkeley Packet Filter (BPF) evolved, its interesting history and the immense powers it holds – Brendan calls it ‘brutal’. I came across this while studying interpreters and small process virtual machines like the proposed KTap’s VM. I was looking at some known papers on register vs stack based VMs, their performance and the various code dispatch mechanisms used in these small VMs. The review of the state of the art soon moved to native code compilation, and a discussion on LWN caught my eye. The benefits of JIT were too good to be overlooked, and BPF’s application in things like filtering, tracing and seccomp (used in Chrome as well) made me interested. I knew that the kernel devs were on to something here. This is when I started digging through the BPF background.


Network packet analysis requires an interesting bunch of tech, right from the time a packet reaches the embedded controller on the network hardware in your PC (hardware/data link layer) to the point where it does something useful in your system, such as displaying something in your browser (application layer). For the connected systems evolving these days, the amount of data transfer is huge, and the support infrastructure for network analysis needed a way to filter things out pretty fast. The initial concept of packet filtering was developed with such needs in mind, and many strategies were discussed with every filter, such as the CMU/Stanford Packet Filter (CSPF), Sun’s NIT filter and so on. For example, some earlier filtering approaches (such as CSPF) used a tree based model to represent filters and evaluated them using predicate-tree walking. This earlier approach was also inherited in the Linux kernel’s old filter in the net subsystem.

Consider an engineer’s need to have a probably simple and unrealistic filter on the network packets with the predicates P1, P2, P3 and P4 :


A filtering approach like that of CSPF would have represented this filter in an expression tree structure as follows:


It is then trivial to walk the tree, evaluating each expression and performing operations on each of them. But this means there can be extra costs associated with evaluating predicates that do not necessarily have to be evaluated. For example, what if the packet is neither an ARP packet nor an IP packet? Knowing that predicates P1 and P2 are false, we should not need to evaluate the other 2 predicates and perform 2 more boolean operations on them to determine the outcome.

In 1992-93, McCanne et al. proposed the BSD Packet Filter with a new CFG-bytecode based filter design. This was an in-kernel approach where a tiny interpreter would evaluate expressions represented as BPF bytecodes. Instead of simple expression trees, they proposed a CFG based filter design. One control flow graph representation of the same filter above can be:


The evaluation starts at P1; the right edge is taken for FALSE and the left edge for TRUE, and each predicate is evaluated in this fashion until the evaluation reaches a final result of TRUE or FALSE. The CFG has an inherent property of ‘remembering’: if P1 and P2 are false, the path goes straight to the final FALSE, so P3 and P4 need not be evaluated. This was then easy to represent in bytecode form, where a minimal BPF VM can be designed to evaluate these predicates with jumps to TRUE or FALSE targets.
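To make the difference concrete, here is a minimal userspace sketch that counts how many predicates each approach evaluates. The exact filter is my assumption, reconstructed from the bytecode in this post as (P1 OR P2) AND P3 AND P4; the `pkt` struct, `SRC_ADDR` and all function names are made up for illustration and have nothing to do with the kernel's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet summary; field names are illustrative only. */
struct pkt { unsigned ethertype, src, len; };

#define SRC_ADDR 0x0a000001u   /* assumed source address to match */

static int evals;              /* number of predicates evaluated so far */

static bool p1(const struct pkt *p) { evals++; return p->ethertype == 0x0800; } /* IP  */
static bool p2(const struct pkt *p) { evals++; return p->ethertype == 0x0806; } /* ARP */
static bool p3(const struct pkt *p) { evals++; return p->src == SRC_ADDR; }
static bool p4(const struct pkt *p) { evals++; return p->len < 0x400; }

typedef bool (*pred_fn)(const struct pkt *);

/* Expression tree: both children are always evaluated. */
enum kind { LEAF, AND, OR };
struct tree { enum kind kind; pred_fn pred; const struct tree *l, *r; };

static const struct tree t1 = { LEAF, p1, NULL, NULL };
static const struct tree t2 = { LEAF, p2, NULL, NULL };
static const struct tree t3 = { LEAF, p3, NULL, NULL };
static const struct tree t4 = { LEAF, p4, NULL, NULL };
static const struct tree t_or   = { OR,  NULL, &t1,    &t2 };
static const struct tree t_and  = { AND, NULL, &t_or,  &t3 };
static const struct tree t_root = { AND, NULL, &t_and, &t4 };

static bool tree_eval(const struct tree *t, const struct pkt *p)
{
    if (t->kind == LEAF)
        return t->pred(p);
    bool l = tree_eval(t->l, p);
    bool r = tree_eval(t->r, p);   /* evaluated even when l already decides */
    return t->kind == AND ? (l && r) : (l || r);
}

/* CFG: each node names the next node for TRUE and for FALSE. */
enum { ACCEPT = -1, REJECT = -2 };
struct cfg_node { pred_fn pred; int on_true, on_false; };

static const struct cfg_node cfg[] = {
    { p1, 2, 1 },            /* IP?  yes: check source, no: try ARP */
    { p2, 2, REJECT },       /* ARP? no: nothing else can save us   */
    { p3, 3, REJECT },
    { p4, ACCEPT, REJECT },
};

static bool cfg_eval(const struct cfg_node *g, const struct pkt *p)
{
    assert(g != NULL && p != NULL);
    int i = 0;
    while (i >= 0)
        i = g[i].pred(p) ? g[i].on_true : g[i].on_false;
    return i == ACCEPT;
}

/* Helpers returning the number of predicate evaluations for one packet. */
int tree_cost(const struct pkt *p) { evals = 0; tree_eval(&t_root, p); return evals; }
int cfg_cost(const struct pkt *p)  { evals = 0; cfg_eval(cfg, p);      return evals; }
```

For a packet that is neither IP nor ARP, the tree walk still touches all four predicates while the CFG walk stops after two. A smarter tree walker could short-circuit too, of course; the point is only that the CFG encodes that knowledge in its edges.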

The BPF Machine

A pseudo-instruction representation of the same filter described above for earlier versions of BPF in Linux kernel can be shown as,

l0: ldh [12]
l1: jeq #0x800, l3, l2
l2: jeq #0x806, l3, l8
l3: ld [26]
l4: jeq #SRC, l5, l8
l5: ld len
l6: jlt #0x400, l7, l8
l7: ret #0xffff
l8: ret #0

To know how to read these BPF instructions, look at the filter documentation in the kernel source and see what each line does. Each of these instructions is actually just a bytecode which the BPF machine interprets. Like all real machines, this requires a definition of what the VM internals look like. In the Linux kernel’s version of the BPF based in-kernel filtering technique, there were initially just 2 important registers, A and X, with another 16 register ‘scratch space’ M[0-15]. The instruction format and some sample instructions for this earlier version of BPF are shown below:

/* Instruction format: { OP, JT, JF, K }
 * OP: opcode, 16 bit
 * JT: Jump target for TRUE (8 bit offset, relative to the next instruction)
 * JF: Jump target for FALSE (8 bit offset, relative to the next instruction)
 * K: 32 bit constant
 */

/* Sample instructions */
{ 0x28,  0,  0, 0x0000000c },     /* 0x28 is opcode for ldh */
{ 0x15,  1,  0, 0x00000800 },     /* skip the next instruction if A == 0x800 */
{ 0x15,  0,  5, 0x00000806 },     /* jump to FALSE (offset 5) if A != 0x806 */

There were some radical changes done to the BPF infrastructure recently – extensions to its instruction set and registers, the addition of things like BPF maps, etc. We shall discuss those changes in detail, probably in the next post in this series. For now we’ll just see the good ol’ way of how BPF worked.


Each of the instructions seen above is represented as an array of these 4 values, and each program is an array of such instructions. Before a filter is run, it goes through a verifier for a sanity check, to make sure the filter code is secure and would not cause harm. The BPF interpreter then looks at each opcode and performs the operations on the registers or data accordingly, passing the program’s instructions through a dispatch routine. As an example, here is a small snippet from the BPF instruction dispatch for the instruction ‘add’ before it was restructured in Linux kernel v3.15 onwards,

127         u32 A = 0;                      /* Accumulator */
128         u32 X = 0;                      /* Index Register */
129         u32 mem[BPF_MEMWORDS];          /* Scratch Memory Store */
130         u32 tmp;
131         int k;
133         /*
134          * Process array of filter instructions.
135          */
136         for (;; fentry++) {
137 #if defined(CONFIG_X86_32)
138 #define K (fentry->k)
139 #else
140                 const u32 K = fentry->k;
141 #endif
143                 switch (fentry->code) {
144                 case BPF_S_ALU_ADD_X:
145                         A += X;
146                         continue;
147                 case BPF_S_ALU_ADD_K:
148                         A += K;
149                         continue;
150 ..

The above snippet is taken from net/core/filter.c in Linux kernel v3.14. Here, fentry points to the current sock_filter instruction and the filter is applied to the sk_buff data element. The dispatch loop (136) keeps stepping through the instructions until the filter returns. The dispatch is basically a huge switch-case, with each opcode being tested (143) and the necessary action being taken. For example, here an ‘add’ operation on registers would add A+X and store it in A. Yes, this is simple, isn’t it? Let us take it a level above.
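If you want to play with this outside the kernel, below is a minimal userspace sketch of the same switch-based dispatch idea. It is not the kernel's code: `struct insn`, `run_filter`, `demo_filter` and `DEMO_SRC` are names I made up, only a handful of opcodes are handled, and there is no verifier. The opcode values are the classic BPF encodings, and since classic BPF has no real ‘jlt’, the length check is written as ‘jge’ with swapped targets. The ARP check uses the standard EtherType 0x0806.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same { OP, JT, JF, K } layout as described above (illustrative struct). */
struct insn { uint16_t code; uint8_t jt, jf; uint32_t k; };

enum {
    OP_LD_ABS  = 0x20,   /* A = 32 bit word at packet offset K */
    OP_LDH_ABS = 0x28,   /* A = 16 bit half-word at offset K   */
    OP_LD_LEN  = 0x80,   /* A = packet length                  */
    OP_ADD_K   = 0x04,   /* A += K                             */
    OP_JEQ_K   = 0x15,   /* if (A == K) skip JT else skip JF   */
    OP_JGE_K   = 0x35,   /* if (A >= K) skip JT else skip JF   */
    OP_RET_K   = 0x06,   /* return K                           */
};

#define DEMO_SRC 0x0a000001u   /* assumed source address, 10.0.0.1 */

/* The filter from this post, hand-assembled. */
const struct insn demo_filter[] = {
    { OP_LDH_ABS, 0, 0, 12 },
    { OP_JEQ_K,   1, 0, 0x0800 },     /* IP: skip the ARP test      */
    { OP_JEQ_K,   0, 5, 0x0806 },     /* not ARP either: to "ret 0" */
    { OP_LD_ABS,  0, 0, 26 },
    { OP_JEQ_K,   0, 3, DEMO_SRC },
    { OP_LD_LEN,  0, 0, 0 },
    { OP_JGE_K,   1, 0, 0x400 },      /* too long: to "ret 0"       */
    { OP_RET_K,   0, 0, 0xffff },
    { OP_RET_K,   0, 0, 0 },
};

/* Interpret n instructions against a packet; returns the filter verdict. */
uint32_t run_filter(const struct insn *prog, size_t n,
                    const uint8_t *pkt, uint32_t len)
{
    assert(prog != NULL && pkt != NULL);
    uint32_t A = 0;

    for (size_t pc = 0; pc < n; pc++) {
        const struct insn *in = &prog[pc];

        switch (in->code) {
        case OP_LDH_ABS:
            if ((uint64_t)in->k + 2 > len)
                return 0;
            A = ((uint32_t)pkt[in->k] << 8) | pkt[in->k + 1];
            break;
        case OP_LD_ABS:
            if ((uint64_t)in->k + 4 > len)
                return 0;
            A = ((uint32_t)pkt[in->k] << 24) | ((uint32_t)pkt[in->k + 1] << 16) |
                ((uint32_t)pkt[in->k + 2] << 8) | pkt[in->k + 3];
            break;
        case OP_LD_LEN:
            A = len;
            break;
        case OP_ADD_K:
            A += in->k;
            break;
        case OP_JEQ_K:
            pc += (A == in->k) ? in->jt : in->jf;
            break;
        case OP_JGE_K:
            pc += (A >= in->k) ? in->jt : in->jf;
            break;
        case OP_RET_K:
            return in->k;
        default:
            return 0;    /* unknown opcode: reject the packet */
        }
    }
    return 0;            /* ran off the end without a ret */
}
```

Because the jump offsets are unsigned, the program counter can only move forward, so the loop always terminates. That is the no-loop property showing up in the encoding itself.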

JIT Compilation

This is nothing new. JIT compilation of bytecodes has been around for a long time. I think it is one of those eventual steps taken once an interpreted language decides to optimize bytecode execution speed. Interpreter dispatch can get a bit costly once the size of the filter/code and the execution time increase. With high frequency packet filtering, we need to save as much time as possible, and a good way is to convert the bytecode to native machine code by Just-In-Time compiling it and then executing the native code from the code cache. For BPF, JIT was first discussed in the BPF+ research paper by Begel et al. in 1999. Along with other optimizations (redundant predicate elimination, peephole optimizations, etc.), a JIT assembler for BPF bytecodes was also discussed. They showed improvements from 3.5x to 9x in certain cases. I quickly started checking whether the Linux kernel had done something similar. And behold, here is what the JIT looks like for the ‘add’ instruction we discussed before (Linux kernel v3.14),

switch (filter[i].code) {
case BPF_S_ALU_ADD_X: /* A += X; */
        seen |= SEEN_XREG;
        EMIT2(0x01, 0xd8);              /* add %ebx,%eax */
        break;
case BPF_S_ALU_ADD_K: /* A += K; */
        if (!K)
                break;
        if (is_imm8(K))
                EMIT3(0x83, 0xc0, K);   /* add imm8,%eax */
        else
                EMIT1_off32(0x05, K);   /* add imm32,%eax */
        break;

As seen above in arch/x86/net/bpf_jit_comp.c for v3.14, instead of performing operations directly during the code dispatch, the JIT compiler emits the native code to a memory area and keeps it ready for execution. The JITed filter image is built like a function call, so we add some prologue and epilogue to it as well,

/* JIT image prologue */
EMIT4(0x55, 0x48, 0x89, 0xe5); /* push %rbp; mov %rsp,%rbp */
EMIT4(0x48, 0x83, 0xec, 96);    /* subq  $96,%rsp       */
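The emission step for ‘add’ can be sketched in plain userspace C as below. This is only an illustration under my own assumptions: `struct emitter`, `emit`, `emit_add_x` and `emit_add_k` are made-up names standing in for the kernel's EMIT macros, and the bytes are just written to an ordinary buffer, never executed. A real JIT would place them in executable memory and wrap them with a prologue and epilogue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A growing run of machine-code bytes (illustrative, not the kernel's). */
struct emitter { uint8_t *buf; size_t len, cap; };

static void emit(struct emitter *e, const uint8_t *bytes, size_t n)
{
    assert(e->len + n <= e->cap);       /* caller sizes the buffer */
    memcpy(e->buf + e->len, bytes, n);
    e->len += n;
}

/* Does the constant fit in a sign-extended 8 bit immediate? */
static int is_imm8(int32_t v)
{
    return v >= -128 && v <= 127;
}

/* A += X, with A in %eax and X in %ebx: add %ebx,%eax */
void emit_add_x(struct emitter *e)
{
    const uint8_t code[] = { 0x01, 0xd8 };
    emit(e, code, sizeof code);
}

/* A += K: pick the shortest x86 encoding for the constant. */
void emit_add_k(struct emitter *e, int32_t k)
{
    uint32_t u = (uint32_t)k;

    if (k == 0)
        return;                          /* adding zero needs no code */
    if (is_imm8(k)) {
        const uint8_t code[] = { 0x83, 0xc0, (uint8_t)u };   /* add imm8,%eax */
        emit(e, code, sizeof code);
    } else {
        const uint8_t code[] = {                              /* add imm32,%eax */
            0x05,
            (uint8_t)(u & 0xff), (uint8_t)((u >> 8) & 0xff),
            (uint8_t)((u >> 16) & 0xff), (uint8_t)((u >> 24) & 0xff),
        };
        emit(e, code, sizeof code);
    }
}
```

Small constants cost 3 bytes, larger ones 5, and adding zero costs nothing. It is a tiny peephole optimization done while emitting.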

There are rules to BPF (such as no loops) which the verifier checks before the image is built, as we are now in the dangerous waters of executing externally supplied machine code inside the Linux kernel. In those days, all this would have been done by bpf_jit_compile, which upon completion would point the filter function to the filter image,

fp->bpf_func = (void *)image;

Smooooooth… Upon execution of the filter function, instead of interpreting, the filter will now start executing the native code. Even though things have changed a bit recently, this was indeed a fun way to learn how interpreters and JIT compilers work in general and the kind of optimizations that can be done. In the next part of this post series, I will look into what changes have been done recently, the restructuring and extension efforts to BPF and its evolution to eBPF, along with BPF maps and the very recent and ongoing efforts in hist-triggers. I will discuss my experimental userspace eBPF library, its use for LTTng’s UST event filtering and its comparison to LTTng’s bytecode interpreter. Brendan’s blog post is highly recommended and so are the links under ‘More Reading’ in that post.

Thanks to Alexei Starovoitov, Eric Dumazet and all the other kernel contributors to BPF that I may have missed. They are doing awesome work and are the direct source of my learnings as well. Looking at the versatility of eBPF, its adoption in newer tools like shark, and Brendan’s views and first experiments, it seems this may indeed be the next big thing in tracing.

Embedded, Kernel, Linux

Jumping the Kernel-Userspace Boundary – Procfs and Ioctl

I recently needed a very fast and scalable way to share moderate chunks of data between my experimental kernel module and a userspace application. Of course, there are many ways already available. Some of them are documented very nicely here. Over a few blog posts, I will share the mechanisms I have used to transfer data and provide such interfaces.


I have used the Procfs before (with the seq_file API) when I needed to read my experimental results back in userspace and perform aggregation and further analysis there only. It usually consisted of a stream of data which I sent to my /proc/foo file. From a userspace perspective, it is essentially a trivial read-only operation in my case,

#include <linux/module.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>

/* init stuff */
static int val;   /* placeholder for whatever value the module exposes */
static struct proc_dir_entry *proc_entry;

/* Use seq_printf to provide access to some value from module */
static int foo_print(struct seq_file *m, void *v) {
    seq_printf(m, "%d\n", val);
    return 0;
}

static int foo_open(struct inode *inode, struct file *file) {
    return single_open(file, foo_print, NULL);
}

/* The operations */
static const struct file_operations foo_fops = {
    .owner = THIS_MODULE,
    .open = foo_open,
    .read = seq_read,
    .llseek = seq_lseek,
    .release = single_release,
};

/* Create procfs entry in module init */
static int __init foo_init(void) {
    proc_entry = proc_create("foo", 0, NULL, &foo_fops);
    return proc_entry ? 0 : -ENOMEM;
}

/* Remove procfs entry in module exit */
static void __exit foo_exit(void) {
    remove_proc_entry("foo", NULL);
}
module_init(foo_init);
module_exit(foo_exit);
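On the userspace side, reading such an entry is just ordinary file I/O. Here is a small sketch; `read_text_file` is a helper name I made up, and the `/proc/foo` path in the usage note assumes the module above is loaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Read up to cap-1 bytes of a text file into buf and NUL-terminate it.
 * Returns the number of bytes read, or -1 if the file cannot be read.
 */
long read_text_file(const char *path, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    size_t n = fread(buf, 1, cap - 1, f);
    int failed = ferror(f);
    fclose(f);
    if (failed)
        return -1;

    buf[n] = '\0';
    return (long)n;
}
```

Calling `read_text_file("/proc/foo", buf, sizeof buf)` would then hand back whatever the module printed with seq_printf, ready for aggregation in userspace.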


I have also used ioctls before (more importantly, I call them eye-awk-till. *grins*). They are used in situations where the interaction between your userspace application and the module resembles actual commands on which the kernel has to act. With each command, the userspace can send a message containing some data which the module can use to take action. As an example, consider a device driver for a device which measures temperature from 2 sensors in a cold room. The driver can provide certain commands which are executed when the userspace makes ioctls. Each command is associated with a number, called the ioctl number, which the device developer chooses. In a similar fashion to the Procfs interface, a file_operations struct can be defined with a new entry and the initializations are done in the module,

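/* The ioctl; the two command names are illustrative placeholders */
static long temp_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) {
	switch (cmd) {
	case TEMP_READ_SENSOR1:
		if (copy_to_user((char __user *)arg, temp_buff, 8))
			return -EFAULT;
		return 0;
	case TEMP_READ_SENSOR2:
		if (copy_to_user((char __user *)arg, temp_buff, 8))
			return -EFAULT;
		return 0;
	default:
		return -ENOTTY;
	}
}
/* File operations */
static const struct file_operations temp_fops = {
	.owner = THIS_MODULE,
	.unlocked_ioctl = temp_ioctl,
};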

There are other complexities involved as well, such as using the _IO() and _IOR() macros to define safe ioctl numbers. To know more about the ioctl() call and how it is used, I suggest you read Chapter 7 of LKMPG. Note that newer kernels have some minor changes in code, so refer to some device drivers using ioctls in the latest kernel releases. Each ioctl in our case means we have to copy data from user to kernel or from kernel to user using the copy_from_user() or copy_to_user() functions. There is also no way to avoid the context switch. For small readings done occasionally, this is an OK mechanism I would say. Now consider that in a parallel universe, this sensor system records a high quality thermal image in addition to each temperature measurement. Also, there are thousands of such sensors spread across a lego factory and they are being read each second from a common terminal. For such huge chunks of data accessed very frequently, each additional copy is a performance penalty. For such scenarios, I used the mmap() functionality to share a part of memory between the kernel and userspace. I shall discuss more about mmap in my next post.
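Coming back to the ioctl numbers for a moment, here is a hedged sketch of how the _IOR() macro could be used for the two-sensor example. The magic character `'t'`, the command names, `struct temp_reading` and `temp_cmd_sensor` are all my own illustrative choices; a real driver should pick a magic number that does not clash with existing ones. The header would typically be shared between the module and the userspace application.

```c
#include <assert.h>
#include <linux/ioctl.h>

#define TEMP_IOC_MAGIC 't'   /* hypothetical magic for this driver */

/* Payload copied to userspace: 8 bytes, as in the driver snippet. */
struct temp_reading { char data[8]; };

/* _IOR: userspace reads data from the driver. */
#define TEMP_READ_SENSOR1 _IOR(TEMP_IOC_MAGIC, 1, struct temp_reading)
#define TEMP_READ_SENSOR2 _IOR(TEMP_IOC_MAGIC, 2, struct temp_reading)

/*
 * Decode a command: returns the sensor number (1 or 2), or -1 if the
 * command does not belong to this driver.
 */
int temp_cmd_sensor(unsigned int cmd)
{
    /* The payload size is baked into the number itself. */
    assert(_IOC_SIZE(TEMP_READ_SENSOR1) == sizeof(struct temp_reading));

    if (_IOC_TYPE(cmd) != TEMP_IOC_MAGIC)
        return -1;
    if (_IOC_DIR(cmd) != _IOC_READ)
        return -1;
    if (_IOC_SIZE(cmd) != sizeof(struct temp_reading))
        return -1;
    if (_IOC_NR(cmd) < 1 || _IOC_NR(cmd) > 2)
        return -1;
    return (int)_IOC_NR(cmd);
}
```

Encoding the direction, magic, sequence number and payload size into one number is what makes these numbers ‘safe’: a command meant for some other driver is very unlikely to decode as one of yours.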

Android, Java, Linux, UX/UI

QRite – QR Code Generator

UPDATE: Tanushri recently won an LG G Watch at PyCon 2015 (thanks to Google), so we have a watch to start implementing the smartwatch feature. The app will now be free on Google Play as well, as we don’t need money anymore 🙂

I had recently bought a smart-watch (SmartQ Z) and as mentioned in some of my previous posts, I started playing around and creating my own watch-faces. One fine day, I got an idea to put my contact details on the watch screen as a widget showing a QR code (with my vCard encoded) so that if someone asks me for my details, I could just flash my wrist and they could scan and save my contact info. Geek lvl over 9999.

So, I spent a few evenings with a friend re-inventing the wheel, and created my first QR Code generation app. It was mostly an Android revision for us, with some UX/UI brainstorming done now and then. The goal was to make it simple. Very simple. And clean. With just the necessary customizations given to the user. This is how it looks so far:


You can check out the beta attempt, get some more details and screenshots here : It’s an open source app and if you want to use it or want to contribute to the source code, head over to this Github repo. The source is not so rosy and glorious, considering I just focused on functionality and gave exactly 0 about anything else :/ It needs some cleanup for sure, and that is indeed the next step.

So there are some features already available, such as configuring color and size, and a list of recent contact details for which a QR code was generated. Some features are yet to be added, such as digitizing an already available QR code by ‘scanning’ it with your camera. The major feature, “Send to Smartwatch”, is also pending. The plan is to support Android Wear, Pebble and generic Android (such as SmartQ Z) platforms. However, I don’t have an Android Wear or Pebble watch, for which this could be most useful. If you want to donate an Android Wear or Pebble watch (if you have one extra) or some money for me to get one, or if you just want to support open source development, you can consider buying the app from this Play Store link (it will be made free as soon as all features are completed). You can also send a donation by clicking

Just FYI, my goal is approx USD $130 (price of cheapest LG W100 watch). Feedback and suggestions for the app are welcome.

TL;DR – Just another QR code generation app. Open sourced. Smartwatch integration in progress. Donate money for Android Wear/Pebble or submit code if you want.

Linux, UX/UI

Ask Fedora UX Redesign Updates #2

So taking into consideration most of the suggestions I had obtained during the recent Design FAD at Red Hat, the initial slides of desktop version are also ready. Maybe these are enough to start off with a basic CSS template which we can build upon iteratively while we work on more mockups for other major pages in parallel. Here is the updated mockup:

(Mockup image: home-desktop)

Just a reminder that the color palette used here is not standard and is only indicative. The standard Fedora color palette will be used in the code. The next mockups to be done are the views for individual question pages, the contributor/people page, the ‘ask question’ page and maybe the registration page if required. Most of the other stuff, I think, would be best done directly in the code. I was anticipating a bigger challenge in streamlining the design, but personally I think that, as Askbot apes the UX of StackOverflow and similar Q&A websites to a large extent, the users are already familiar with the flow of AskFedora. In any case, later on we can have something like this : an Ask Fedora Tour page for beginners/new users so that the initial user-inertia reduces.

Contributions and comments welcome.

Linux, UX/UI

Ask Fedora UX Redesign Updates #1

The recently concluded Fedora Design FAD 2015 at Red Hat’s Westford office turned out to be quite productive, and my idea of redesigning the AskFedora user interface gathered some momentum there. I met so many faces I had always known just as IRC nicks and, above all, enjoyed Sirko complaining to everyone about not playing Champions of Regnum that well during the gaming nights we had 🙂

So coming down to the point, I was joined by Zack (sadin) and Sarup (banas) for the initial redesign efforts. With their suggestions, I completed some high-fidelity mockups for the mobile and desktop interface intended to be used for Ask Fedora. As suggested by Sarup, we started with Mobile mockups first to get a simplistic view of user requirements. It can then be “stretched” for a desktop interface later on. After some feedback from Sirko, Emily, Mairin, Marie et al. here is a first mockup slide.

(Mockup image: home-mobile)

To get a context of the changes, you will have to visit the current AskFedora instance and compare. Newer slides and the desktop views are on the way. Even though this takes time, it’s important to have an idea of how you want your interaction with the application to be, and the FAD gave me a chance to do some mockups before we dive into code and CSS. Which reminds me that Zack and I managed to set up an Openshift repo of an askbot instance one night and will mostly be using it for testing out our design. (I also learnt Openshift awesomeness that night.) Most of the real usable UI updates will be posted on this repo :

Your comments and suggestions are welcome. More updates on the way.

Android, Embedded, Linux, UX/UI

Some Cool Retro Watchfaces

These holidays, I was refraining from spending huge sums on stocking gadgets I don’t need. But (un)fortunately, I ended up buying a Nexus 5, a Moga Hero controller and a cheap smartwatch. So what do you do when you get an Android device? Exactly! Start hacking on it – root, custom ROMs, custom kernel, cool apps! So, my experimentation started with the SmartQ Z Smartwatch I had bought. For the price (CAD $91, including shipping from China) this was an irresistible piece of tiny Android to start tinkering around with. It packs a 1GHz Ingenic processor (MIPS) – JZ4775, 512MB DDR RAM and 4GB flash. All of this with a tiny 1.54 inch screen with a 240×240 resolution. The latest firmware supplied by the vendor is based on Android 4.4 with a custom launcher. It is a very hacker-friendly device and came with ‘su’ out of the box 🙂 Though it’s a bit old (2013 launch) and the manufacturer seems to have abandoned development on it, for the price it’s a pretty impressive piece of tech on your hand.

I decided to make my own watchfaces with custom features and more inspiring UI than the watchfaces provided in the watch. Also, the hidden agenda was to check out the new Android Studio. As with everything Chinese, the SDK and docs for the Smartwatch were in Chinese! But fortunately, after surfing through XDA Developers forums, I found a link to the English docs on their website to refer to the weather APIs that I needed to use. With moderate efforts, I was able to make the following watch faces :

They are actually developed as widgets, and the launcher apparently checks whether the name of the AppWidgetProvider class you use starts with the string “WatchFace”. If this is the case, it simply puts that widget along with the custom watch faces in the menu.

If you own a similar watch/with similar screen dimensions, you can try installing the apps or use the source code for building your own cool watchfaces. I really wish I had an Android Wear watch rather than this so that I could develop on a more useful and up-to-date platform. Thanks for reading. Happy new year! Here is the source code and specs for these watch faces :

LCD Watch Face

Source :

Download : lcdwatchface.apk

CRT Watch Face

Source :

Download : crtwatchface.apk

If you find any bugs, report them on Github. Also, if you want me to port them on Android Wear, let me know. I’ll try to do that in emulator.