Summary of Readings (July 2016)

  • K&R The C Programming Language

    No introduction needed. It is still the most practical way to remember C syntax. Some of its idioms are no longer considered good practice, but it is still the best way to get familiar with C.

  • Understanding and Using C Pointers

    If you feel lost when it comes to pointers, this is the book you are looking for. There are topics the author could have dived into more deeply, but it is still a great book about pointers.

  • C in a Nutshell

    This is the most recent C reference book (published in 2015) I have come across. I did not read the whole book, just the interesting parts about must-have tools like gcc, make and gdb. I cannot stand reading the cryptic gdb, make or other manuals. This is just way better.

    Follow me on goodreads

Back to Basics

It’s been a long time since I have written anything technical here. I don’t know if it’s related to age or laziness. Anyway, this will be a kind of technical post, so sit tight.

As a human being it’s really hard to observe yourself as an outsider, and if you are a software developer it becomes even harder. There is too much going on in the software development world: people create awesome new tools, design esoteric programming languages (or not) and build top-notch frameworks. In short, people are doing really great stuff. Consuming and learning all of these tools, languages, systems etc. would probably take my whole life. So I decided to stop and be an outsider to myself. Here are a couple of questions I asked myself:

  • What essential knowledge am I missing?
  • Do I have enough practical knowledge about essential topics?
  • Do I have a methodology for learning new stuff?
  • Am I learning by constantly hitting walls or by absorbing knowledge beforehand?

These were just some of the questions I asked myself, and as a result I decided to stop learning new shiny stuff in order to fill those knowledge gaps and absorb the essential knowledge I am missing for my career. Not everyone has to do this; you can have a very successful software engineering career without it. You don’t have to touch a single line of C or assembly code, know how an OS scheduler works or know what a semaphore is. But I am pretty confident that you will eventually hit a wall somewhere in your career. Progressing as a software engineer is somewhat like building a brick wall: if you skip an important or essential piece of knowledge, that gap is not going anywhere and will stay with you for your whole career.

That’s why I have been reading tons of books lately. Really, learning a new high-level language is meaningless in the long run if you have no idea how the function call stack works or how a compiler or interpreter really works (if you are asking yourself how a compiler works, you probably don’t know how it works). Building cool websites is meaningless if you don’t fully grasp how HTTP works. I have a list of topics I am missing that I am currently ashamed of sharing with the public (but probably will). Some of these actually date back to my university days, and I should probably dive deep into them because, let’s face it, no one will teach you how the stack pointer advances inside a function call, what the base pointer is for or how an AST is formed.

Are there any disadvantages to doing this? Definitely yes. Because the software industry is somehow built on hype and trends, you will probably miss a couple of trains, but I don’t mind as long as I will be fine in the long run. When people ask me “hey, did you check X framework?” or “look how cool Y language is with all this fancy stuff”, I only say “no”.

I call it back to basics because without the basics or fundamentals you can’t have a solid grasp of higher-level constructs. Another reason I call it back to basics is that I used to do some of this stuff right (if you don’t count the last 3-4 years). Here is what I have not been doing lately and should start doing right away.

  • Write blog posts and other technical writing.
  • Learn new stuff, break new stuff, create usable products with it and write about it.
  • Socialize with the developer community again, which I have been missing a lot lately.
  • Read personal blogs, not motivation-based bullshit Medium posts.
  • Give back to FOSS. Really, this is what I should be doing whenever I find time.

I started blogging in my 2nd year at university (back in 2008). I got to know lots of people through my blog, and lots of people recognize me just because they came across it. It’s one of the ways software developers socialize. I hope this post will be the end of the era of not writing.

[Image: GitHub activity]

I was constantly learning new things and created tons of stuff, whether it was a mobile application, a library, a web application or a product for users. And I was doing this before anyone realized that this stuff would become a trend or a hit. I am not saying I see things before other people do; I am just talking about having experience with, say, native mobile development before it became trendy. That’s also the fun part of it: no flamewars, no fanboys, just a couple of early adopters and you.

I stopped going to conferences, both as a speaker and as a listener. The reason was that I did not want to keep seeing the same people at different events as speakers and self-promoters.

Very few people keep a personal blog these days, so I am not sure I can do this anymore.

[Image: IRC]

I have very few contributions other than to my own open source projects, which is a shame. Sticking with your own projects is really very different from constantly socializing on mailing lists, hanging out in IRC rooms (which I still do thanks to IRCCloud) with the maintainers and developers of a project and discussing changes. Because it’s a big public exposure you have to show quality, and this pushes you to improve yourself. I think this is the most important reason why everyone recommends doing it.

C Specification Behavior Meanings

While reading the C specification you may encounter some kinds of behaviour that do not make sense unless you know exactly what they mean. Here are a couple of the well-known ones and their meanings.

Implementation Defined

The behaviour differs between implementations, but each implementation must document its choice, so you can look up what will happen on a given compiler and platform. Example: whether the high-order (sign) bit is propagated when a negative signed integer is shifted right.

Unspecified Behavior

The standard allows two or more possible behaviours, and the implementation picks one without having to document which. Example: the amount of memory allocated when malloc is called with an argument of zero.

Undefined Behavior

The standard imposes no requirements at all, so anything can happen: the program may crash, produce different output on different machines or appear to work. Let’s say you have a character pointer initialized with a string literal (which has static storage and is often placed in read-only memory).

char *p = "hello world\n";
*(p + 1) = 'a';

The code above compiles without a problem, but modifying a string literal is undefined behaviour, and the output may not be what you expect. (On my machine it compiled but gave a bus error; you may get hallo world\n.)

C Memory Layout of Program

When a C program is loaded into memory, it behaves as if it owns the whole physical memory. But this is just an abstraction provided by virtual memory. Virtual memory is backed by a combination of physical memory and disk. A loaded program is made up of pages, and these pages may end up in different parts of physical memory (frames) and may be swapped in and out by the OS when other processes need more memory.

That is how a C program is laid out in virtual memory, but a C program is also divided into logical sections depending on the scope and type of its data.

  • TEXT SECTION
  • DATA SECTION

Text Section

The text section contains the compiled machine instructions. It is usually a read-only memory segment and is sometimes known as the code section.

Data Section

The data section, as the name suggests, contains the data of the program and is divided into different parts.

  • Initialized Data Segment
  • Uninitialized Data Segment / BSS
  • Heap
  • Stack

Initialized Data Segment

This is where explicitly initialized global and static variables are held. (All file-scope variables are initialized, but this segment is for the explicitly initialized ones.)

Uninitialized Data Segment

If you don’t explicitly initialize a static or global variable, it is given a default value: zero for arithmetic types and NULL for pointer types (on most platforms, literally binary zeros). Because it is all zeros, it does not actually take up disk space in the object file. This section is also called BSS for historical reasons (Block Started by Symbol).

Heap

This is where dynamic memory is allocated. In the typical layout it grows upward, toward higher memory addresses. Dynamic memory allocation is done by calls like malloc, calloc or realloc.

Stack

The stack is where local variables, parameters etc. (mostly function-related stuff) reside. When a function is called, its stack frame is pushed onto the stack, and when it returns, the frame is popped off in LIFO manner.

Here is a higher-level picture of the memory layout.

Using Etcd Cluster For Service Discovery

What is etcd?


Etcd (which is part of the CoreOS project) is basically a distributed and reliable key-value store. It’s kind of like a cache but built with something else in mind. Etcd comes in handy when you want to make sharing configuration less of a pain or automate service discovery.

Because it’s highly integrated with CoreOS and is fairly new tech compared with older alternatives like ZooKeeper, it’s hard to find use cases and detailed documentation. Even though the documentation is really good for a fairly new project, I find it rather impractical, so I decided to write about my experience with it.

Use case

Even if you write decoupled and independent services, eventually they need to know about each other in order to delegate tasks. But these services live in a dynamic environment: they are probably deployed ten times a day or more, and it’s not uncommon for their infrastructure, DNS names, IPs etc. to change. So how does service X find service Y when it needs it?

Writing service Y’s info statically into service X may be a solution, but what happens when you grow to 20 services? The situation will start to get out of hand. You also need to update all of those configurations whenever you update one of the services, say with a new DNS name. Clearly we need some kind of centralised mechanism.

What we need

  • Distributed storage
  • Reliable
  • Clusterable?
  • Would be nice if it were easy to set up
  • Access to the storage via HTTP, maybe + REST?
  • Security and SSL

Yes, etcd provides all of these features out of the box, which is why I chose it for service discovery.

Setup

It’s easy to set up etcd on most well-known systems; you will probably test it on some Unix variant. I will assume you have Ubuntu, but others should be similar. The etcd release page has good documentation.

curl -L  https://github.com/coreos/etcd/releases/download/v2.3.0-alpha.0/etcd-v2.3.0-alpha.0-linux-amd64.tar.gz -o etcd-v2.3.0-alpha.0-linux-amd64.tar.gz
tar xzvf etcd-v2.3.0-alpha.0-linux-amd64.tar.gz
cd etcd-v2.3.0-alpha.0-linux-amd64
./etcd

Managing Etcd Processes

There is not much documentation about how to manage etcd processes, but because I have worked with supervisord in the past and know it well, I decided to use it for that. DigitalOcean has nice documentation about how to install supervisor.

sudo apt-get install supervisor
sudo service supervisor restart

Now you have supervisor up and running. Supervisor needs to know which program you want to run and with which parameters to start the process, so it expects a config file, namely supervisord.conf. Here is mine for etcd.

# -*- conf -*-
[include]
files = *.supervisor

[supervisord]

[supervisorctl]
serverurl = unix:///srv/www/supervisord.sock

[unix_http_server]
file = /srv/www/supervisord.sock

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[program:etcd-service]
user=ubuntu
process_name = etcd-infra1
command = /usr/local/bin/etcd
  -name infra1 -data-dir /srv/www
  -initial-advertise-peer-urls http://10.10.1.10:7001
  -listen-peer-urls http://0.0.0.0:7001
  -listen-client-urls http://0.0.0.0:4001
  -advertise-client-urls http://10.10.1.10:4001
  -discovery https://discovery.etcd.io/a79907dce59fe2047475302bc3f7be4c
  -discovery-fallback exit
autostart=true
autorestart=true
stderr_logfile=/srv/www/long.err.log
stdout_logfile=/srv/www/long.out.log

You can start supervisor with your own custom configuration file if it’s not in the default location.

supervisord -c /path/to/your/config.config

Supervisor has a handy tool to check the state of processes and to start, stop and restart them. It’s called supervisorctl. Let’s tell supervisor to reread the config and restart all processes so that it can manage etcd.

supervisorctl reread
supervisorctl restart all

Now you have etcd up and running and managed by supervisor. You can access it directly through the machine’s address, which I will assume is 10.10.1.10, and reach the REST API via curl or a browser at the path /v2/keys/.

curl http://10.10.1.10:4001/v2/keys

Done. You should get JSON indicating that this is a directory, but because you haven’t stored any keys or values yet, that is the only thing you get right now. Don’t worry, you will have key-values here as soon as you write something.

A cool thing about etcd is key expiration. You can set an expiration, or TTL, on a key so that it expires after the given time. This keeps key-values up to date and lets old ones expire. It also means that clients need to keep refreshing their values in the etcd cluster, otherwise the keys will become inaccessible (or non-existent in our case).

Clustering

OK, it’s great to have one etcd instance up and running, but you can’t rely on one instance. If it’s gone, your services have no way of finding each other, so we need a backup plan. This is where etcd clustering comes in handy.

Note that etcd uses the Raft algorithm, which is a fairly sophisticated consensus algorithm, but in short: the cluster only works while a majority of its nodes is available, so we need at least a 3-node cluster in order to survive the loss of one node.

Now create two more machines with the same configuration and use supervisor to manage their processes just like the first one. I will wait here patiently until you are done…

10.10.1.10
10.10.1.11
10.10.1.12

Now you have a total of 3 independent etcd instances, but we are nowhere close to being a cluster. In order to form a cluster, the nodes need to be aware of each other. You have two options here: you can assume their IPs and other details won’t change and give them a static configuration, or take the fun path and use automatic discovery.

Auto Discovery

Before starting the etcd instances with the commands below, we need to create a discovery URL using etcd’s public discovery service.

curl https://discovery.etcd.io/new?size=3
https://discovery.etcd.io/a79907dce59fe2047475302bc3f7be4c

After calling the discovery service you get a unique URL. This is where each etcd node advertises itself and learns about the others; you can think of it as some kind of metadata sharing through the discovery URL. Now start each etcd node.

Give each node a different -name, otherwise you won’t get what you expect and they will probably override each other.

command = /usr/local/bin/etcd
  -name infra1 -data-dir /srv/www
  -initial-advertise-peer-urls http://10.10.1.10:7001
  -listen-peer-urls http://0.0.0.0:7001
  -listen-client-urls http://0.0.0.0:4001
  -advertise-client-urls http://10.10.1.10:4001
  -discovery https://discovery.etcd.io/a79907dce59fe2047475302bc3f7be4c
  -discovery-fallback exit

command = /usr/local/bin/etcd
  -name infra2 -data-dir /srv/www
  -initial-advertise-peer-urls http://10.10.1.11:7001
  -listen-peer-urls http://0.0.0.0:7001
  -listen-client-urls http://0.0.0.0:4001
  -advertise-client-urls http://10.10.1.11:4001
  -discovery https://discovery.etcd.io/a79907dce59fe2047475302bc3f7be4c
  -discovery-fallback exit

command = /usr/local/bin/etcd
  -name infra3 -data-dir /srv/www
  -initial-advertise-peer-urls http://10.10.1.12:7001
  -listen-peer-urls http://0.0.0.0:7001
  -listen-client-urls http://0.0.0.0:4001
  -advertise-client-urls http://10.10.1.12:4001
  -discovery https://discovery.etcd.io/a79907dce59fe2047475302bc3f7be4c
  -discovery-fallback exit

After the initial cluster starts, browsing the discovery URL will list the machine IDs in JSON. You can also query cluster health and check which node is the current leader, all through the REST API.

curl https://discovery.etcd.io/a79907dce59fe2047475302bc3f7be4c

Yes, it seems our cluster is working. Etcd uses a leader election algorithm, so when you start 3 instances one of them is elected leader by the votes of the nodes. But as soon as the majority needed to elect a new leader is gone, your cluster is in bad shape. Let’s say two of your instances go down and you have one healthy instance running. A single etcd instance is not able to elect itself leader, so it waits until another node comes along, and then they elect a new leader together. The etcd cluster documentation is kind enough to tell you what to do when your cluster health is in bad shape. Here is how an election goes between nodes when you have two nodes in the cluster:

2015-11-11 16:01:51.010358 I | raft: 5fbc3c7fda7c7d62 is starting a new election at term 56
2015-11-11 16:01:51.010402 I | raft: 5fbc3c7fda7c7d62 became candidate at term 57
2015-11-11 16:01:51.010411 I | raft: 5fbc3c7fda7c7d62 received vote from 5fbc3c7fda7c7d62 at term 57
2015-11-11 16:01:51.010419 I | raft: 5fbc3c7fda7c7d62 [logterm: 56, index: 5091] sent vote request to 65b01f74f8276833 at term 57
2015-11-11 16:01:51.010433 I | raft: raft.node: 5fbc3c7fda7c7d62 lost leader 65b01f74f8276833 at term 57
2015-11-11 16:01:51.304971 I | raft: 5fbc3c7fda7c7d62 [term: 57] ignored a MsgApp message with lower term from 65b01f74f8276833 [term: 56]
2015-11-11 16:01:51.323104 I | raft: 5fbc3c7fda7c7d62 [term: 57] ignored a MsgApp message with lower term from 65b01f74f8276833 [term: 56]
2015-11-11 16:01:51.323263 I | raft: 5fbc3c7fda7c7d62 received vote rejection from 65b01f74f8276833 at term 57
2015-11-11 16:01:51.323353 I | raft: 5fbc3c7fda7c7d62 [q:2] has received 1 votes and 1 vote rejections
2015-11-11 16:01:52.210290 I | raft: 5fbc3c7fda7c7d62 is starting a new election at term 57
2015-11-11 16:01:52.210332 I | raft: 5fbc3c7fda7c7d62 became candidate at term 58
2015-11-11 16:01:52.210338 I | raft: 5fbc3c7fda7c7d62 received vote from 5fbc3c7fda7c7d62 at term 58
2015-11-11 16:01:52.210346 I | raft: 5fbc3c7fda7c7d62 [logterm: 56, index: 5091] sent vote request to 65b01f74f8276833 at term 58
2015-11-11 16:01:52.217618 I | raft: 5fbc3c7fda7c7d62 received vote rejection from 65b01f74f8276833 at term 58
2015-11-11 16:01:52.217756 I | raft: 5fbc3c7fda7c7d62 [q:2] has received 1 votes and 1 vote rejections
2015-11-11 16:01:53.446317 I | raft: 5fbc3c7fda7c7d62 [term: 58] received a MsgVote message with higher term from 65b01f74f8276833 [term: 59]
2015-11-11 16:01:53.446560 I | raft: 5fbc3c7fda7c7d62 became follower at term 59
2015-11-11 16:01:53.446651 I | raft: 5fbc3c7fda7c7d62 [logterm: 56, index: 5091, vote: 0] voted for 65b01f74f8276833 [logterm: 56, index: 5092] at term 59
2015-11-11 16:01:53.451726 I | raft: raft.node: 5fbc3c7fda7c7d62 elected leader 65b01f74f8276833 at term 59