
Starting Your Own IT Business In India?

This is the Current State Of Processes Around IT!

I recently started working on a foreign software development contract. It turns out that to ‘earn’ via such a contract, you need to have a business entity of sorts. I had to undergo a long, painful process riddled with unknowns and uncertainties to establish one and raise my invoices. Here is a brief account of my understanding of the process and the various norms and laws. I do not claim that the material below is correct; it is only how I understood things, and I would recommend that you not follow it blindly. You should, as I did, take the help of professionals in this area (Chartered Accountants or CA, Company Secretary or CS, and lawyers) before you proceed, and form your own understanding. This write-up is merely a record of things I went through.

GST and stuff

  • First things first, link your mobile number (and email ID) to your Aadhaar number. By the way, linking your Aadhaar to your mobile and your mobile to Aadhaar are two different things. The first keeps your mobile working; the second allows you to verify your Aadhaar number when asked. Those who have registered for Aadhaar recently might already have their number linked. You can verify this on the Aadhaar portal. This currently is a big pain, as all the private e-Seva Kendras are closed and government centres are flooded. Solutions:
    • Get in a queue at about 5 am on a Monday at a bank or post office that still modifies Aadhaar, to get an appointment for the modification. You then have to be present with all (I do not know which all) possible documents at the date and time of your appointment to get the modification done. Otherwise, stand in the queue again for a new token.
    • Go to a remote village’s regional office to get it modified.
  • Register your business. In any way possible, but do register. A proprietorship is probably the easiest way forward; in a proprietorship, you and the business are the same entity and are interchangeable.
    • It is good to name the business, even if you name it after yourself, i.e. “Nikhil Wanpal” in my case. A CA/CS would suggest adding something like “& Co” or “& Associates” to the name, for clarity and to avoid confusion in legal matters. But this will bite back when opening a current account. If you decide to go for a proprietorship, here are the documents required:
      • Self-attested copy of PAN and Aadhaar of the owner of the business
      • Copy of latest electricity bill of premises
      • Copy of NOC for use of premises
      • Photo of the name plate along with the owner (of course, we cannot create a board, so take a printout, paste it on your home door and take your photo in front of it)
      • Mobile number and email ID
      • A brief description of the business activity
    • Many other forms exist: OPC, Partnership, LLP and Pvt Ltd. Choose the one with the least hassle. All the other forms either require another person to be part of the business, and hence a dependency, or require repetitive documentation to be maintained, like minutes of board meetings. Also, when your business is a separate entity, it has a separate PAN, and you need to be extra careful when making transactions: making a business transaction with your personal card (or vice versa), even by mistake, can be an issue, since the cards are linked to different PANs. You need to ensure you use the correct card for the correct type of expense. This is not an issue for larger organisations, where there are people managing purchases and granting expense claims, but for a single person working alone, this seemed like an unnecessary hassle to me.
  • Get your Shop Act license. This step is the same as the one above; basically, a Shop Act license is how you register a proprietorship. It is a license from the government to carry on business, is ideal for shops/merchants, and works well for us small businesses too. The license was amended in Dec 2017, and the portal to avail it was closed till 22 Feb 2018 for modifications. When the portal reopened, it came back with a new format of the license. Under the new act, a shop with fewer than 9 employees does not need the license; if you apply anyway, it needs neither verification nor renewal. This is a good change for us, but it is a pain for now, since it has not yet propagated through the system.
  • Get your rubber stamps. They are required on almost all documents you sign. A designation/authority stamp, an address stamp and a round stamp are the minimum required. Also, buy a blue/violet stamp pad.
  • Get your Udyog Aadhaar. This is a new self-declaration of business, required since Jan 2018. It is a supporting document for the Shop Act license. Having a valid mobile number linked to your Aadhaar is mandatory to get this. In theory, this could ease your current account pains.
  • Get your GSTIN. GST registration is mandatory for a business in India if your turnover is more than 20 lakh in a financial year, and in practice you will need it anyway as an exporter. You have to file GST returns for every month, by the 20th of the next month. If you can get the name of the business into the GST registration, it can simplify further processes. GSTIN and PAN are linked to entities, and in a proprietorship there is but one entity: you. So your GSTIN will be linked to your personal PAN. There is no easy way to get a separate GST number for your business unless you go for a ‘Company’ registration. Apparently, there is a way to add a business name to your personal GST registration. But then, what if I choose to open another business? Or rename my existing one? GST modification is not online yet and requires physical document submission. I did not know this; I got my GST registration way before the Shop Act license, and so my certificate does not have the name of the proprietorship in it.
  • Open a current account in the name of the business. Apparently we, the IT professionals, do not fall under the group of professions that are allowed to do business using a savings account, and hence a current account is mandatory; I do not yet know the implications. Now this part is tricky:
    • The Shop Act license is no longer mandatory for businesses with fewer than 9 employees, and hence the format is modified and no longer includes an expiry date or a verification stamp.
    • Udyog Aadhaar is not a registration, but a self declaration (which is also what the new shop act has functionally become, given there is no verification).
    • Unfortunately, banks have not yet received a “Government Mandate” to accept this new Shop Act license, and hence the document is not considered a legal/valid one for opening a current account. I have visited and inquired with several banks already; all of them say they will open the current account, sure, but when you show them the Shop Act license they do not recognise it and ask for a ‘real shop act’. The banks I visited: ICICI, Axis, SBI (they almost kicked me out), Saraswat Cooperative Bank, Bank of Maharashtra, Kotak Mahindra and PNB (7 in total). A local cooperative bank was the only one that recognised the new format and opened a current account for me. (This issue may be limited to the Maharashtra region.)
    • In the last couple of weeks, I have heard of HDFC Bank offering a current account with a declaration from a practising CA along with other documents. I have not verified this.
    • From this step onward you need your stamps everywhere.
  • Get a registered digital signature. Contact a Company Secretary, who can get this for you. Go for the longest term and for both signing and encrypting capabilities. The format for the IEC has changed; it is now an online document and does not have a physical stamp or signature. Officials may have to be convinced of this. Required documents: refer to the next point; the documents list there covers both.
  • Get your IEC issued by the DGFT. IEC stands for Import-Export Code. We need this to make our export of software legal. To be able to get an IEC, we need the digital signature and the company name; an IEC cannot be issued in the name of an individual. A CS can do this for you. The form has to be filled in physically, then submitted online, and then once more physically at their office. The IEC requires a cancelled cheque in the name of the business, which is why you need a current account in the name of the business in the first place. Go for export permission for everything, because you can: manufacture, retail and service. Needs verification: even if you do not meet the turnover requirement for GST, you still likely need an IEC to make your export of software/services legal. Required documents:
    • Photograph (3x3cms) of Applicant
    • Copy of PAN card and Aadhaar Card of applicant
    • Cancelled cheque bearing the applicant entity’s name and IFSC code
    • Mail ID and Contact no. of applicant
    • Copy of PAN Card of company, if applicable
    • Digital Signature of applicant with password
    • Copy of COI, MOA and AOA of Company, if applicable
    • Confirmation of any one business activity for the IEC (Merchant cum Manufacturer cum Service Provider)
  • Get your LOU/LUT: Letter of Undertaking (officially called LUT on the GST portal). This says that you are doing an export and are not liable for any GST, and that if you turn out to be, you will pay it. This does not exempt you from filing GST returns, only from paying GST directly out of your income: you still have to file your returns and declare expenses and the GST you have paid. Until you have the LUT, you have to declare your income and pay (18%) GST on it; you can request a refund later, once you get your LUT. My CA was generous here and suggested I also grant him a letter of authority, so I would not have to go to the GST office and stand in the queue myself. It takes 2 days for the digital certificate and 2-3 days for the IEC to be generated. The process for the LUT has changed and the submission is now online: the documents listed below are scanned and uploaded to the portal. There is also no longer a physical document, or any document at all, issued for the LUT; an acknowledgement number is generated when the uploaded documents are accepted. This generation of the acknowledgement number is considered the ‘acceptance’ and ‘grant/approval’ of the LUT, and the acknowledgement number itself is your LUT number. This was communicated via a government circular (# 40/14/2018-GST) dated 6th April. This may be a hassle if your client does not accept or is unaware of the new LUT process. In either case, you also need a LUT declaration on your invoices. You should not have any income in the duration between getting your GST registration and getting your LUT; in other words, get the LUT as early as you can. Required documents:
    • Form No RFD-11 on the firm’s letterhead.
    • LUT on Rs 500/- stamp paper.
    • Copy of the GST registration certificate, duly self-attested.
    • Copy of the Import-Export Code certificate issued by the DGFT.
    • Declaration on your letterhead that you are not liable for any contravention under the GST Act.
    • Copy of the address proof of your entity, duly self-attested.
    • Copy of your PAN and address proof, duly self-attested.
    • Authority letter in favour of your CA on Rs 500/- stamp paper, to represent you.

Pro-tip: Close your stamp pad and pack your stamps immediately after use, to avoid damaging your documents. 😉

Generating Invoices

Once you have all the documents set up, all income has to be justified with a GST invoice: generated, signed, stamped and sent to the client on or before the date of payment. There are many formats floating around for GST invoices, and there is also a government-issued format, but there was nothing I could readily use. Companies like Zoho have invoicing software with GST support, and there are many others, but I found them unnecessary when raising so few invoices. For my purposes, I went with Google Sheets and created a custom format which includes all the fields required in the official government format and is more suited to our (IT) business. Here is the link.

Feel free to copy and use it for your own business purposes. I would love a shout-out if it helped though. 🙂

The invoice is marked with text wherever you need to modify it and fill in values relevant to you, like your business name, client name, GSTIN etc. Follow the text steps in the invoice itself to get your own personal invoice. If you are billing hourly, you might want to fill in hours, minutes and the hourly rate, and use this formula in the amount field:

=(SUM(F17*60+G17)/60)*H17

It converts the hours into minutes, adds the minutes, converts the total back into decimal hours and multiplies by the hourly rate to calculate the total amount. No other change should be required. When looking for SAC/HSN codes, I found two that could apply to regular IT development and have included them both in the template.
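For example (going by the formula, F17 holds hours, G17 minutes and H17 the hourly rate): with 7 hours, 30 minutes and a rate of 40, the formula computes (7 × 60 + 30) / 60 = 7.5 hours, giving an amount of 7.5 × 40 = 300.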

Filing Income Tax Returns

Someone might suggest something apparently amazing to save tax, a scheme called ‘presumptive taxation’. In my view and understanding, there are a lot of practical issues with this scheme.

  • Presumptive taxation lets you self-declare expenses/income from your turnover without being forced to have your business audited.
  • The allowed percent is 50% for an individual/proprietorship/professional. (8% for business)
  • It means you can self-declare 50% as losses/expenses, without any documentation, computation of income or balance sheet, and it would still be considered a legal tax filing under presumptive taxation; i.e. you would be liable for tax only on 50% of the income.
  • You should actually have that much business expense. Using this scheme for evasion of tax is not advisable.
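To put numbers on it: on a turnover of 20 lakh as a professional, you could declare 10 lakh as presumptive income, compute tax on that alone, and maintain no books of accounts at all.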

There may be people who would suggest otherwise and it might sound good. Here are some thoughts for those who intend to follow this route:

  • You cannot use the extra money in any form. If you are found to have invested, say, 1 lakh in a mutual fund and the income tax office asks for an explanation, say, 5 years later, you would have no way of proving legal income.
  • Say you falsely claimed 50% as business expense (and used the money for personal expenses): explaining any spending or investment out of it would be illogical, since it is expected to have happened from the remaining 50% of your income.
  • Say you are doing nothing illegal and your actual expense really is 50%. The scheme is new, and hence there is little prior knowledge or case history to show what can be done to ‘escape’ such inquiries if they happen; hence I (and my CA) are not in favour of it.
  • Needs verification: This scheme locks you in for 5 years, i.e. you cannot change the way you show expense for 5 years once you opt for this.
  • All banks and legal firms ask for balance sheet and computation of income of minimum 3 financial years (more on this below).
  • That being said, you may not be able to get any loan for up to the next 8 (5+3) years, since you will have no (or not enough) balance sheets to show. You cannot keep a balance sheet or computation of income when filing under presumptive tax, as the scheme exists precisely to exempt you from maintaining balance sheets. If you have balance sheets, it means you have kept books of accounts, which means you cannot opt for presumptive tax.
  • Under presumptive tax, even if a bank agrees to give you a loan, it will be based only on your declared 50% income.

I would rather show actual expenses and depreciation of business assets the traditional way. Well, unfortunately for us IT folks, the income is almost always via bank transfers, and the expenses are limited and again mostly via bank/card payments, so almost everything is documented anyway. It means I end up paying a lot more tax than I would under presumptive taxation, but I am okay with it.

Some Points on Loan applications

As the situation was for me, I also had to apply for a loan. Ah, that in itself is a long story but some takeaways are:

  • As a business person, you cannot get a loan unless you have been in business for a minimum of 3 years and the business itself has been ‘in business’ for 2 years.
  • You need to present books of accounts, balance sheets, and computation of income for each business year of the last 3 years and the current year. The sheets and books for current year will be tentative, but that is still okay.
  • For granting a loan, your income is considered to be ~70% of the average of your last 3 years’ income. The bank may add higher ‘safety’ margins on top as it sees fit.
  • If a bank grants you a loan, the rate of interest will always be at least 0.2% higher than the normal rate.
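For example, by the thumb rule above, if your last three years’ incomes were 9, 10 and 11 lakh, the bank would take roughly 70% of the 10 lakh average, i.e. about 7 lakh, as your income when deciding eligibility.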

 

To reiterate: the process and steps documented above are what I had to go through. There were a whole lot of unknowns and uncertainties on the way, and what you see above are the conclusions alone. One of the primary reasons for the unknowns was that the processes had changed very recently (I was almost always one of the first clients of the CA/CS/lawyers I consulted since the change in process, as if my timing were matched to the changes!), causing uncertainties. But almost all the changes move the processes from offline to online, complex to simplified, physical to automated, and multiple approvals to fewer or none, which I see as a positive thing, leaving aside the fact that I was caught in the transition. I hope that by the time you need to go through this, the processes will have been simplified further!

Mute Mic With Keyboard Shortcut On Ubuntu Or Linux Mint

Here is a quick tip for all the automation buffs like me. Turn your mic on and off with just a keyboard combo.

I do all my work remotely, which is also to say I have a lot of conference calls. And like you, I hate it when people do not mute their phones / laptop mics when not speaking! (cue the obligatory meme about not putting your phone on mute during a call!)

I always wished for a hotkey of some sort to mute / un-mute myself during a call. So here is a way to do it.

  1. Use a Linux machine. (This in itself is a great tip! 😉 ) These steps in particular are for an Ubuntu / Linux Mint machine.
  2. Put the following snippet in a bash script file and add it to your PATH. You could also define it as a bash alias and load it from your custom bash profile, but then assigning it a shortcut may not be as easy.
  3. Set a keyboard shortcut to trigger this script. I use Meta+M for this.

This script toggles the mic state and shows a nice (transient) notification of the changed state, with an intuitive icon! It should also quickly replace the previous notification, but somehow that does not seem to work yet.
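A minimal sketch, assuming ALSA’s amixer manages your mic’s ‘Capture’ channel and libnotify’s notify-send is installed (adjust the names to your setup):

#!/usr/bin/env bash
# Toggle the mic (capture) channel and show the new state.

STATE=$(amixer set Capture toggle | grep -m 1 -o '\[on\]\|\[off\]')

# The x-canonical hint asks the notification daemon to replace the
# previous notification; not all daemons honour it.
if [ "$STATE" = "[off]" ]; then
  notify-send -t 1000 -h string:x-canonical-private-synchronous:mic \
    -i microphone-sensitivity-muted "Mic muted"
else
  notify-send -t 1000 -h string:x-canonical-private-synchronous:mic \
    -i microphone-sensitivity-high "Mic unmuted"
fi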

(cue in the meme about speaking on mute! ;))

CIDR Explained in Layman Terms and Decimal Numbers

If you work on the cloud, it is likely that you have used those numbers and the slash that follow IP addresses. The documentation points to something called CIDR. It is said to be a super helpful, awesome standard adopted by the internet that extended the life of IPv4. But have you tried searching for ‘what is CIDR’? It is all jargon, all of it. There is hardly any layman-friendly explanation of the term. Even Wikipedia has managed to find a complex way of explaining it. And yet it is something we use every day, especially if you are working on cloud, containers or container orchestration frameworks.

We use it for defining networks when using docker. We use it when we specify services and networks in orchestration frameworks like Swarm, Compose, Kubernetes, ECS or GKE. We use it when we specify ingress/egress rules in an AWS security group, when we create a subnet in AWS EC2, when we define VPCs, and when we define clusters. Even single IP addresses (ranges of size 1) are at times defined using CIDR notation.

I was wondering if CIDR could be explained without getting into binary number calculations or more jargon about classes and routing, and I found a way.

CIDR is simply a way of specifying a range of IP addresses. In the cloud we mostly deal with IPv4 addresses, so let us see how we can think of CIDR in the IPv4 context.

An IP address has 4 parts, joined together by dots. Each part can have 2 ^ 8 = 256 values, between 0 and 255, both inclusive. In CIDR, we add a slash after the IP, followed by a number between 1 and 32, both inclusive; this number is in fact a netmask specification. Now these 32 numbers can be divided into 4 groups of size 8, similar to the IP address (the groups being 1-8, 9-16, 17-24 and 25-32). Each group affects the corresponding section of the IP address to generate a range, like in the diagram below:

Now, looking at the number and the group it belongs to, you can quickly tell which IPs come in the range. For example:

  1. 99.123.43.64/8 –> 99.0.0.0 to 99.255.255.255
  2. 99.123.43.64/16 –> 99.123.0.0 to 99.123.255.255
  3. 99.123.43.64/24 –> 99.123.43.0 to 99.123.43.255
  4. 99.123.43.64/32 –> 99.123.43.64!

The size of the range decreases as this number goes up, 1 being widest and 32 being strictest. Simple enough?

Now on to a slightly more complex part: what about numbers that are not multiples of 8? You can certainly define something like 99.123.43.64/18 or 99.123.43.64/5 or 99.123.43.64/27; what would that mean? We have seen that each group of the netmask governs the IP values (0 – 255) in its group (and the groups that come after it). What if we divided these groups further? The larger groups went in multiples of 8; we will now divide the 256 values in 8 different ways, using 8 powers of 2, with a little 10th-grade maths:

  1. 7th power of 2, i.e. 128 creates two sub-groups: 0 – 127 and 128 – 255
  2. 6th power of 2 i.e. 64 creates four sub-groups: 0 – 63, 64 – 127, 128 – 191, 192 – 255
  3. 5th power of 2 i.e. 32 creates eight sub-groups: 0 – 31, 32 – 63, 64 – 95, 96 – 127, 128 – 159, 160 – 191, 192 – 223, 224 – 255
  4. 4th power of 2 i.e. 16 creates sixteen sub-groups: 0 – 15, 16 – 31, …. 240 – 255
  5. 3rd power of 2 i.e. 8 creates thirty two sub-groups: 0 – 7, 8 – 15, …. 240 – 247, 248 – 255
  6. 2nd power of 2 i.e. 4 creates sixty four sub-groups: 0 – 3, 4 – 7, …. 248 – 251, 252 – 255
  7. 1st power of 2 i.e. 2 creates one hundred and twenty eight sub-groups: 0 – 1, 2 – 3, 4 – 5, … 252 – 253, 254 – 255
  8. zeroth power of 2 i.e. 1 creates two hundred and fifty six sub-groups: 0, 1, 2, 3, 4, 5, …. 253, 254, 255

With me so far? Now let us see how we can understand the meaning of intermediate numbers:

  1. Step 1: Identify the larger group your netmask belongs to using the diagram above; call it the major group. Ex: /18 belongs to group 3 (17-24) and /30 belongs to group 4 (25-32).
  2. Step 2: Deduct your netmask number from the higher bound of the group. Ex: with /18 you get 24 – 18 = 6, and with /30 you get 32 – 30 = 2; this is your power of 2 (say n).
  3. Step 3: Now you can calculate the number of IPs that fall in this range as the nth power of 2. Meaning, when you specify /18 you have 2 ^ 6 = 64, and for /30 you have 2 ^ 2 = 4 possible values in the range.
  4. Step 4: Now it is just a matter of identifying which sub-group the number in the major group’s position fits in, and you have the exact range of IPs that fit your CIDR. Ex: /18 means group 3 and sub-groups of the 6th power; so for an IP like 99.123.43.64, the number in the 3rd group is 43, and it fits in the 1st sub-group (0 – 63).

Let us look at some examples:

  1. 99.123.43.64/27
    1. 27 is in group 4 (25 – 32)
    2. The power is 32 – 27 = 5, so there are 2 ^ 5 = 32 addresses, and we pick from the 5th-power sub-groups.
    3. And the 4th group number, 64, falls in the 64 – 95 sub-group. So the range is: 99.123.43.64 to 99.123.43.95
  2. 99.123.43.64/30
    1. 30 is in group 4 (25 – 32)
    2. The power would be: 32 – 30 = 2 and there would be 2 ^ 2 = 4 addresses.
    3. and the range would be: 99.123.43.64 to 99.123.43.67
  3. 99.123.43.64/18
    1. 18 is in major group 3 (17 – 24).
    2. The power would be: 24 – 18 = 6 and there will be 2 ^ 6 = 64 possible values in group 3. Applying all possible values in group 4 for each in group 3, gets us: 2 ^ 6 * 2 ^ 8 = 16384 addresses
    3. and the range would be: 99.123.0.0 to 99.123.63.255
  4. 99.123.43.64/13
    1. 13 is in major group 2 (9 – 16)
    2. The power would be: 16 – 13 = 3 and there will be 2 ^ 3 = 8 possible values in group 2. Applying all possible values from group 4 and 3, we get 2 ^ 3 * 2 ^ 8 * 2 ^ 8 = 524288 addresses
    3. And the range would be: 99.120.0.0 to 99.127.255.255

I hope that clarifies it a bit!

[TechNggets] Episode 2: Intro to Git Flow

This is the second episode of podcast “Tech Nuggets and Thoughts”.

 

Some docs on git flow:

  1. The blog that brought it to us: http://nvie.com/posts/a-successful-git-branching-model/
  2. git flow scripts project discussed in podcast: https://github.com/petervanderdoes/gitflow-avh
  3. A superb cheat sheet for git flow: https://danielkummer.github.io/git-flow-cheatsheet/

 

To get updates, you can subscribe to the podcast on Apple iTunes, player.fm, or via the RSS feed. If you have any suggestions, thoughts or recommendations, please feel free to comment below. You can also reach me on the podcast’s twitter handle @TechNggets or my personal account @nikhilwanpal.

(If the fancy player above does not work, try the bare bones player below.)

Hello world, again! (Broken links)

I have moved my blog over from Blogger to WordPress!

As these two platforms are not exactly compatible, old links are broken. Please give me some time to fix them; I am working on it.

How do I access the old blog pages?

Meanwhile, you can:

  1. Try replacing the year and date between the domain and the actual post slug in the link with ‘/blog/’.
  2. If you are logged in to WordPress, the blog may show you a search button in the upper right corner. In that case, try searching for the post.

Please feel free to use comments on this page to report broken links.

[TechNggets] Episode 1: Intro to Containers and Self

Here is the first episode of “Tech Nuggets and Thoughts”

In this episode we talk about containers: what they are, how they work, what docker is, and when / when not to use docker.

 

To get updates, you can subscribe to the podcast on Apple iTunes, player.fm, or via the RSS feed. If you have any suggestions, thoughts or recommendations, please feel free to comment below. You can also reach me on the podcast’s twitter handle @TechNggets or my personal account @nikhilwanpal.

(If the fancy player above does not work, try the bare bones player below.)

 

We need to talk, says one microservice to another

‘But how?’ asks the other service!
Ever wondered how we communicate? One would not believe what a complex, multi-step process it is. It involves some fairly complex terms like perception, encoding, medium and decoding. Let us take a look at a diagram explaining this:

So what is the relation with microservices? Communication between two services is not much different: it follows a process very similar to this, and in fact it can be explained with the exact same steps! Consider two services: a ShippingService, written in Java, requesting the preferred address of a user identified by an ID from a UserAddressService over a REST call:
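To put the same steps in code, here is a hypothetical sketch (service URL, path and client code are all assumed for illustration) of what the ShippingService side could look like:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PreferredAddressCall {
    public static void main(String[] args) throws Exception {
        String userId = "42";
        // Encoding: the intent "preferred address of user 42" becomes a URL.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://user-address-service/users/"
                        + userId + "/preferred-address"))
                .header("Accept", "application/json")
                .GET()
                .build();
        // Medium: the message travels over the network as HTTP.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // Decoding: the JSON body would be parsed back into an Address object
        // here (e.g. with a JSON mapper); perception is whether ShippingService
        // understands those fields the same way UserAddressService meant them.
        System.out.println("Reply to decode: " + response.body());
    }
}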

Okay, so human communication can be used as an analogy to understand inter-service communication in a microservices stack. So what? Well, it tells us that the scenarios of failure can also be the same, and that we can understand microservice communication by relating it to human communication.

Why

Before we dive deep, we should quickly consider why this matters at all in microservices. It is basically the difference between inter-process and intra-process communication. Consider this: intra-process communication, meaning communication between components of a single process, is like talking to oneself. You understand yourself well (usually!); at least there is a whole lot less ‘miscommunication’ when talking to oneself. This is the kind that happens in monoliths, and is why it was not so much of an issue until we started working on microservices, which rely on inter-process communication. To relate to this, consider talking to other people instead: known or unknown (API, authentication etc.), people of a different race, origin or nationality (perception), at a distance or near you (response time), on a phone vs in person (medium/network, protocol), directly vs indirectly or at a different time (whoa! yeah, i.e. sync vs async, like leaving a note), speaking an unknown language (encoding-decoding, JSON/XML/binary like gRPC), of a different background or upbringing (perception again), in different mental states (well, stateless vs stateful services), telling them a secret or casual gossip (secure vs insecure). Now you know why working in teams is so difficult, and also microservices!

Types

Let us break down all the phases and see the types of potential problems. This is not some standard classification, but I have found this method effective for understanding the issues through analogies.

  1. Encoding: Failure to encode the object correctly, by missing attributes or encoding them under different names. For example, frameworks in Java allow choosing different attribute names in the JSON/XML format; these are specified as strings, and there is no real validation on them other than tests. Also, how do we ensure that both services are using the same message structure? This can partly be ensured by sharing a common object-model library. But that goes against the principle of views knowing only the models they need; not every service needs to know all the attributes of a Customer object! You might as well use queues for communication then.
  2. Sending the message: The whole set of issues which can occur due to not sending the message on the correct protocol, address, port or path; the address-related issues. Integration testing cannot always help here, as some of these depend on the production infrastructure as well.
  3. Decoding: Being the reverse of encoding, this comes with all the same issues.
  4. Perception: Now these are some of the serious issues, which you tend to miss even in unit and integration tests and which can be caught only in later phases, when working against live services. If you encounter these, you can most certainly assume that apart from inter-service issues, you also have some team communication issues.
  5. Feedback: One of the most important steps in communication is the feedback: the acknowledgement of receipt of the message. There can be plenty of causes for a service not to respond, including all of the issues discussed above, plus potential networking issues and issues related to the health of the service.
  6. Cascading Failures: A much more serious situation, which is merely an outcome of failing to identify the issues in service communication and safeguard against them.

Now that we are acquainted with the terms, I want to tell you a couple of stories; really scary, real stories.

Case study 1

I know of a Value Added Services (VAS) company; you know, the services you never want yet your carrier charges you for, yeah, those. This company was one of the few trying to build a useful service you may actually want to pay for, trying to play fair and by the rules, even at a revenue loss. The VAS industry suffers from a considerable frequency of fraud, and so it was said that the biggest crime is charging your users twice. Such incidents would be immediately flagged as fraud by the carriers and the service would be banned.

We had a call flow where a “Billing Service” makes a call to a “Carrier Integration Service” for charging, which in turn, after firing the charging request to the carrier, makes a call to an external-internal Notification Service (a shared service hosted by a different unit) to send a notification to the user. It worked fine for years, until one day this external-internal service suddenly slowed down, timing out all the calls from the Carrier Integration Service, which in turn caused a timeout on the Billing Service, which treated it as a failure and ‘retried’ the billing call. The issue got flagged in minutes, but hundreds of users were charged multiple times that day before the team could respond.

A whole lot of things went wrong here. There were feedback issues, likely caused by a network issue, giving rise to perception issues and finally causing a cascade. The worst part, though, was the cascade.

Case study 2

Another such story, not so grave, was when the Billing Service sent a request for partial charging, but the Carrier Integration Service refused to honour it. Both services used enums to identify the keyword used, both had integration tests, and all worked fine. It was only after the services were tested against working instances of one another, in a pre-prod environment on a crunch day, that the issue was identified. It was stupidly simple: the two services used different values for the constant! Again, a whole lot of things went wrong here; the most important was the perception issue caused by miscommunication between the teams working on these two services.

The Solutions

Discussing every step identified to fix the case studies we saw would be a story in itself, so we shall discuss the first solutions applied to the most glaring causes in both cases. For the cascade, we made it a point that every service would implement Hystrix for every single inter-service communication call. Hystrix is a circuit breaker, meaning it wraps a chunk of logic (say, a method) and on exception it can flag, throttle, block and bypass the method in question. The idea is to wrap the calls to external services, aka dependencies, and when something goes wrong, give them time to recover by bypassing calls, safeguarding the sender service from cascading the issue. We had Hystrix in some services, but some teams had argued that it was an unnecessary complication for internal services and services that had been working fine. Well, that was before the incident; everyone just jumped on it as the first change once we recovered from the impact. In my view, a circuit breaker is a tool microservices should never be built without. As an additional safety net, we also ensured that the Carrier Integration Service builds a temporary cache of all the users it has processed and validates against it before firing any call (not every solution to a problem is purely technical!).
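To illustrate (a hedged sketch with hypothetical names, not our actual production code), wrapping the call to the Notification Service in a HystrixCommand looks roughly like this:

import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import java.util.function.Supplier;

// Wraps the call to the Notification Service dependency in a circuit breaker.
public class NotifyUserCommand extends HystrixCommand<Boolean> {

    private final Supplier<Boolean> notificationCall; // the real HTTP call

    public NotifyUserCommand(Supplier<Boolean> notificationCall) {
        super(HystrixCommandGroupKey.Factory.asKey("NotificationService"));
        this.notificationCall = notificationCall;
    }

    @Override
    protected Boolean run() {
        // Exceptions and timeouts here are counted by the circuit breaker;
        // too many of them and the circuit opens.
        return notificationCall.get();
    }

    @Override
    protected Boolean getFallback() {
        // Invoked when the circuit is open or run() fails: fail fast and
        // give the dependency time to recover instead of retrying blindly.
        return Boolean.FALSE;
    }
}

// Usage: boolean sent = new NotifyUserCommand(() -> client.notify(userId)).execute();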

For the perception issue, we needed to ensure that, post encode-decode, the receiver understands the same thing the sender meant. We had hosted stubs against which we tested the services in the automation testing phase. These stubs were dumb services implementing the same API as the service they stubbed. They were developed by whichever team needed them; essentially, the Billing Service team would never develop the stub for the Billing Service, which caused the discrepancy between the stub and the actual service behaviour. This had to change: the team developing the Billing Service was to be responsible for developing the stubs for the Billing Service, and the Carrier Integration Service team for its service’s stubs. This way, the stubs always perceived messages the same way the actual service did.

Now how do we address intra-team communication problems, anyone?

Google Tag Manager: Tag Priorities Vs Tag Sequencing

As most GTM (Google Tag Manager) users will agree, this is a much discussed yet confusing topic! The documentation on these two options is concise and, to be honest, precise in describing what they do and what to expect, yet some of their side effects, combined with the asynchronous nature of JavaScript, are left for users to infer. And this is where much of the confusion seems to come from. Even the many blogs out there on this very topic barely touch this context.
Hence we are going to discuss this very thing today.

Tag Priority

Tag Priority, as described by the documentation, is a number associated with a tag which identifies the order of firing the tags. Firing, not completion. And firing is itself an asynchronous process. If you consider simple HTML tags, firing them means adding them to the HTML of the page, which is what the GTM script is responsible for. What Tag Priority means is that the HTML elements will begin to be added in the order identified by the priority, but the GTM script will not wait for previously added elements to load and execute before adding the next (it cannot even know when one has finished; more on that later). Hence this does not govern the load order of the tags.

Tag Sequencing

Tag Sequencing, as described, is a setting that governs which tags will fire before and after a particular tag. One can imagine this as a setup-run-cleanup process established with GTM tags (which is also apparent in the documentation). Think of it like a unit test: there is a setup which fires before the tag (@Before in JUnit, or beforeEach in mocha), then the tag itself (@Test in JUnit or it("", ()=>{}) in mocha), and then the after-tag/clean-up tag (@After in JUnit or afterEach in mocha). If you were trying to make sure that a given tag fires before another, you should be happy sequencing exists.

Read the document carefully again and you will see it does not speak about completion, yet again! Much like priority, sequencing cannot guarantee the completion of the setup tag before firing the middle (test) tag! It only ensures that the setup tag has ‘fired completely’ before moving on to fire the middle / clean-up tag.

In the Tag Priority documentation, it correctly states: “Tags will still be fired asynchronously (tags will fire whether or not the previous tag has finished.)” and “Tag Sequencing allows you to specify exactly which tags fire before and after a given tag.”. But it says nothing of this sort in the documentation of Tag Sequencing.

By the nature of JavaScript, it is difficult to know when the execution of a particular snippet completes without explicit notification from the snippet itself, in the form of an event being fired or a callback being triggered (Promises fall under this too); but as GTM asks for neither, it cannot really know when your tag has ‘completed processing’.

We can see this by doing a simple experiment. In a GTM container, let us create 3 custom HTML tags as:

1. SetupTag:


<script type="text/javascript" src="https://code.jquery.com/jquery-1.12.4.min.js"></script>
<script type="text/javascript">
  sayOnDoc("'jQuery' object defined on page: " + !!window.jQuery);
</script>

<script type="text/javascript">
  jQuery(document).ready(function() {
    sayOnDoc("'jQuery.ready' fired.");
  });
</script>
<script type="text/javascript">
  document.addEventListener('DOMContentLoaded', function() {
  	sayOnDoc("'DOMContentLoaded' fired");
  });
</script>
<script type="text/javascript">
  var gtmName = 'google_tag_manager';
  var insideGTMAndDomReady = window[gtmName]
            && window[gtmName].dataLayer
            && window[gtmName].dataLayer.gtmDom;
  if (insideGTMAndDomReady) {
    sayOnDoc("'GTM ready' done.");
  }
</script>

<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/mathjs/3.16.5/math.min.js"></script>
<script type="text/javascript">
  sayOnDoc("'math' object defined on page: " + !!window.math);
</script>

<script type="text/javascript" async src="https://cdnjs.cloudflare.com/ajax/libs/d3/4.11.0/d3.min.js"></script>
<script type="text/javascript">
  sayOnDoc("'d3' object defined on page: " + !!window.d3);
</script>

<script type="text/javascript" async src="https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.6.5/angular.min.js"></script>
<script type="text/javascript" async=true src="https://cdnjs.cloudflare.com/ajax/libs/ag-grid/14.0.0/ag-grid.min.js"></script>
<script type="text/javascript" async src="https://cdnjs.cloudflare.com/ajax/libs/underscore.js/1.8.3/underscore-min.js"></script>
<script type="text/javascript">
  sayOnDoc("'_' object defined on page: " + !!window._);
</script>
<script type="text/javascript" async src="https://cdnjs.cloudflare.com/ajax/libs/backbone.js/1.3.3/backbone-min.js"></script>

<script type="text/javascript">
  var someScr = document.createElement('script');
  someScr.onload = function(){
	sayOnDoc("Moment js added to page");
  };
  someScr.src = "https://cdnjs.cloudflare.com/ajax/libs/moment.js/2.19.1/moment.min.js";
  document.head.appendChild(someScr);
</script>

<script type="text/javascript">
  setTimeout(function() {
  	sayOnDoc("My timer timed out..");
  }, 500);
</script>

2. MiddleTag:

<script type="text/javascript">
  sayOnDoc("Middle tag fired");
</script>

3. CleanUpTag:

<script type="text/javascript">
  sayOnDoc("CleanUp tag fired");
</script>

And create an html file or jsFiddle like this.
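The tags above rely on a small sayOnDoc helper being defined on the page; a minimal version (assumed here, since the real one lives in the html file/fiddle) could be:

<script type="text/javascript">
  // Appends a line of text to the page, so the firing and execution
  // order of the tags becomes visible as you read down the page.
  function sayOnDoc(text) {
    var line = document.createElement('div');
    line.textContent = text;
    document.body.appendChild(line);
  }
</script>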
Now we simulate both scenarios:

Priority Test

Setup: Have all three tags triggered on ‘All Pages’. Set the priority of SetupTag to 20 and MiddleTag to 10, and leave CleanUpTag empty or zero.

Observation: The SetupTag is the first to fire. Midway through its execution, MiddleTag and CleanUpTag fire, while the SetupTag is yet to complete.

Sequence Test

Setup: Have only MiddleTag triggered on ‘All Pages’. Set CleanUpTag as the clean-up tag and SetupTag as the setup tag for MiddleTag.

Observation: The SetupTag starts to fire; all the script additions from the setup tag are done first, then the MiddleTag fires and finally the CleanUpTag, almost as if each tag waits for the previous one to complete. Yet the asynchronous sections of the SetupTag fire way after the CleanUpTag!

Additional Observations to Note

Note that in both cases the ‘async’ scripts cannot be guaranteed to execute in order. Also note that the ‘DOMContentLoaded’ event is skipped entirely; this is because the event happens before GTM starts firing at all. It is also interesting that although jQuery is loaded earlier, ‘jQuery.ready’ triggers after ‘GTM ready’ is written to the document, meaning that even once jQuery is loaded on the page, there is a delay before the ‘ready’ event fires; only, the delay is not long enough for the text to show up after the Middle/CleanUp tags. In both cases, the completely asynchronous snippets, adding a script to the page and a timeout delay, happen way after the CleanUp tag.

There is a whole lot to discuss about script loading and execution in the browser, and about the asynchronous nature of JavaScript itself, which we have not and cannot cover in this post. A whole lot of different things can happen depending on how we write the code in the tags and the browser we load the tags on. But at least we now know that we cannot blindly rely on GTM to sequence our tags, especially if there are any asynchronous components in them.

As a side note, there are a whole lot of different and interesting scenarios that arise when we combine the priorities and sequencing with Tag firing options.

Software development hygiene: Why do we brush our teeth?

Yes, why do ‘you’ brush your teeth?
Is it guaranteed that if we brush our teeth twice a day, floss once a day and gargle with an antiseptic, we will never have a toothache or bad breath? And if we did not brush our teeth for, say, a week, would we be guaranteed to have a toothache? For a few months, maybe yes; we might just have to get some treatment done for a few teeth. So the question: why do we brush our teeth, daily?

And how did we start brushing our teeth? Were we born with a brush in one hand, toothpaste in the other and an utter, inexplicable desire to brush our teeth every morning after waking up and before going back to bed? Assuming that no one remembers how they themselves were born, all parents at least will agree with me that this is certainly not the case. So the question: how did we start brushing our teeth daily?

And now the question you might have in your mind: “What’s the point?”
Recently, a person on our team raised these questions: Why do we have unit tests? I have been writing good code, good enough that the QAs do not find any critical issues, nor has anything ever severely broken in production because of my changes, so why should I write tests? If I could think of all the scenarios to unit test, why do we have dedicated QAs on our team? Why should I pass my code through a static code quality analysis tool? All these processes are slowing us down. I have worked without all these processes in the past and it worked quite well, so why do I need this overhead of processes?

I agree, I hate processes.
Yet we need to appreciate the importance of processes and acknowledge where they are required. Come to think of it, why does a process exist? Can we not work without processes and the overheads thereof? Short answer: no, we cannot. Long answer: we can, provided everyone on the team understands the core reasoning for the existence of the process being bypassed and takes responsibility for upholding the goal of that process without strict adherence to the process itself.

Well, how did I start brushing my teeth daily? My mother would tell me: till I was a couple of years old, she used to brush my teeth. When I turned three, she taught me how to do it and would ask me to show her how clean my teeth were. She would ask, “Are they shining when you look in the mirror?”; I would go and check and say “Yes”. When I turned four, she would just remind me to brush, and I think at five I had finally started brushing my teeth daily, without having to show her how clean they were. I do not believe your story would be very different. It took years of practice, and the perseverance of our parents, for us to eventually brush our teeth daily, so that finally we could get rid of the ‘overhead of process’.

Yes, many processes can be chucked as long as the goal is achieved; but are we, as a team, responsible enough to make sure it is achieved every single time? Let us say we are; but are we ready to carry the burden of remembering every single code smell, every single potential bug, and be mindful of them while writing code? Is that even humanly possible? If the answer is yes, sure, go ahead and chuck the quality analysis tools, unit tests, pull requests and code reviews; we don’t need them. But if the answer is no, wait till it becomes yes!
We can certainly bypass processes and get an apparent speed-up, but chucking a process before we are ready is sure to give us a pain in the tooth (and in a few more places)!

Experience: Introducing JMockit To The Team

Like many codebases out there, our codebase at work had a backlog of unit & integration tests, and it was high time we covered it. So one fine day, it was decided that we would no longer accept code without tests. Then the question of ‘how do we write tests’ came up. As one of the architects on the team, I introduced the methodology of unit/integration testing and a mocking library (JMockit) to aid in cases where testing would be difficult without one, conducted trainings and hands-on sessions for everyone on the team, and set up a peer review process, and we were ready. That is all there is to the history of the situation we are in. Well, we are talking about a large team here: some eighty people working on a large codebase, although divided into multiple microservices.

Today, almost every test we have has a mocked class and an expectation set on some dependency. Many tests verify how many times a particular internal/private method or a method on a dependency was invoked; some have VOs mocked, or have assertions on return values inside Verifications blocks. A few have gone to the length of mocking constructors of certain objects because there was no way to inject them into the tested class. We are not even counting the private and static method mocks here.
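For illustration, here is a hedged, hypothetical example (class names invented, not our actual code) of the kind of over-specified test described above:

import mockit.Expectations;
import mockit.Injectable;
import mockit.Tested;
import mockit.Verifications;
import org.junit.Test;

public class BillingServiceTest {

    @Tested BillingService billingService;   // class under test
    @Injectable CarrierClient carrierClient; // mocked dependency

    @Test
    public void billInvokesCarrierExactlyOnce() {
        new Expectations() {{
            carrierClient.charge("user-1", 100); result = true;
        }};

        billingService.bill("user-1", 100);

        // Verifying exact call counts on a dependency couples the test to
        // the implementation: any refactoring of bill() now breaks it,
        // even when the behaviour stays correct.
        new Verifications() {{
            carrierClient.charge("user-1", 100); times = 1;
        }};
    }
}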

When I look at the tests we have today, and look back on the last few months, I wonder if I made a mistake while introducing the mocking tool. In the trainings we had, we discussed the purpose of a mocking tool, the issues due to overuse, and the indicators of overuse; heck, there was even a slide dedicated to this in the developer induction we have here. Architects were involved in many code reviews and tried to avoid these pitfalls, but clearly it was not enough.

Tests are supposed to improve the design of the code. Since highly tangled, coupled classes and classes breaking the SRP (Single Responsibility Principle) are difficult to test, we tend to fix them. As the size of a class increases, the functionality it holds grows, making it difficult to test, so we split it. As the dependencies being created make a class difficult to test, we change it to allow injecting them. We end up splitting large methods, redesigning methods with side effects, removing unneeded code and decoupling from libraries, all to make the code testable, and effectively get cleaner, maintainable, verifiable code. All of this, though, only if we write tests correctly.

If we start modelling our tests to match our code, we not only lose all these benefits, but the tests also become coupled to our code. That brings down the speed of refactoring and new development, because every time we change the code, we need to change the tests to match the implementation. That brings down our overall productivity. And finally, blaming it on the tests, we would stop writing them altogether. Back to square one! Mocking tools have a purpose, but if we mock everything, we get tests tightly coupled to the implementation, adding to the problem. Mocking simple data structures and VOs is not meaningful; we never test their methods separately, and they are not supposed to be tested. (And yes, VOs are data structures; let us not get into that here.) Mocking external libraries is risky, because we then verify their actual behaviour only at runtime, which is exactly what we are trying to avoid by writing tests.

JMockit is a powerful tool, a little too powerful, and harmful if we are not careful. Despite our misuse of it, that it could do all those things is itself marvellous. I am not convinced we should blame it on JMockit; it is as Uncle Ben said to Peter Parker: ‘With great power comes great responsibility’. What we did is our fault; we should have been more responsible with it. I wish I could go back and change the way we used it, or overused it, but our life is not a Git repository.

Luckily though, we have identified these issues and their severity before they started weighing us down heavily. What we need to do first is prevent more such tightly coupled tests from getting in. The course of action now seems to be: along with training the team on how to be more responsible with tests (which, by the way, needs to be a continuous process), identify reviewers and train them to spot such instances, and revise the review process so that reviews go through the identified reviewers and not just peers.