What is blockchain?

Blockchain is…

a distributed
- and…
an immutable ledger

which facilitates...

a growing list of records (called blocks)

which are...

linked together with cryptographic hashes.

In short, it’s a cryptographically secured chain of blocks.

Thus...

Blockchain transactions are irreversible.
The data in any given block cannot be altered retroactively without altering all subsequent blocks.
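
To make the chaining idea concrete, here is a minimal sketch in Java (17 or later). It is not how Bitcoin or any real blockchain is implemented: the Block record, the helper names and the choice of hashing previousHash + data with SHA-256 are assumptions made purely for illustration. It only shows that changing one block breaks the link stored in the block after it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HexFormat;
import java.util.List;

class HashChainDemo {

    // Hypothetical block: its data, the hash of the previous block, and its own hash.
    record Block(String data, String previousHash, String hash) {}

    static String sha256(String input) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return HexFormat.of().formatHex(digest.digest(input.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is available on every Java platform
        }
    }

    static Block newBlock(String data, String previousHash) {
        return new Block(data, previousHash, sha256(previousHash + data));
    }

    // Builds a chain where every block stores the hash of the block before it.
    static List<Block> buildChain(String... entries) {
        List<Block> chain = new ArrayList<>();
        String previousHash = "0"; // placeholder "previous hash" for the first block
        for (String entry : entries) {
            Block block = newBlock(entry, previousHash);
            chain.add(block);
            previousHash = block.hash();
        }
        return chain;
    }

    // A chain is valid if every hash matches its content and every link matches the previous hash.
    static boolean isValid(List<Block> chain) {
        String expectedPrevious = "0";
        for (Block block : chain) {
            boolean linkOk = block.previousHash().equals(expectedPrevious);
            boolean hashOk = block.hash().equals(sha256(block.previousHash() + block.data()));
            if (!linkOk || !hashOk) {
                return false;
            }
            expectedPrevious = block.hash();
        }
        return true;
    }

    public static void main(String[] args) {
        List<Block> chain = buildChain("A pays B 5", "B pays C 2", "C pays A 1");
        System.out.println("valid before tampering: " + isValid(chain));

        // Tamper with the first block, even recomputing its own hash.
        chain.set(0, newBlock("A pays B 500", "0"));
        System.out.println("valid after tampering:  " + isValid(chain));
    }
}
```

The chain is reported as valid before tampering and invalid afterwards, because the second block still stores the hash of the original first block. To hide the change, every subsequent block would have to be rebuilt as well.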

Interestingly, blockchain made Bitcoin the first digital currency to solve the double-spending problem without the need for a trusted authority or central server.

WebSocket

WebSocket is a protocol that enables full-duplex communication between a client and a server over a single TCP connection.

Before WebSocket

Before this protocol, the most popular protocol for communication over the network was HTTP, where the client sends a request and the server returns a response. The communication is always initiated by the client, and the server only responds to the request.

Server-sent events (server push) were another option, where an event on the server side causes the client to receive a message to act upon.

Note that in both options the communication is one-way at any given moment: the two parties do not send and receive simultaneously.

Why do we need WebSocket?

WebSocket enables bidirectional communication, but why do we need that?

With the HTTP approach, the server may need to use several different TCP connections for each client.
HTTP has a high overhead, since headers are sent and received with every request and response.
The client has to maintain a mapping of connections to keep track of the responses from the server.

A single TCP connection that carries traffic in both directions solves these problems, and that is what the WebSocket protocol provides. WebSocket is intended as a replacement for bidirectional communication technologies that are built on top of HTTP.

WebSocket is designed to work over HTTP ports 80 and 443. However, the design does not limit it to HTTP only.

How does it work?

The WebSocket protocol has two main parts.

Handshake
Data transfer

Simply put, the client and server perform a handshake and, if the handshake is successful, the data transfer starts, during which both client and server can send data independently of each other. This data is sent in small units which we can refer to as “messages”. Each message is composed of one or more frames.
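
The text above does not go into the details of the handshake, so here is one small piece of it as described in RFC 6455: the client sends a random Sec-WebSocket-Key header in its HTTP upgrade request, and the server answers with a Sec-WebSocket-Accept value derived from that key. The sketch below computes that value. The class and method names are hypothetical; the GUID constant and the sample key are the ones given in the RFC.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

class WebSocketAcceptKey {

    // Fixed GUID that RFC 6455 tells the server to append to the client's key.
    static final String GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    // Sec-WebSocket-Accept = base64( SHA-1( Sec-WebSocket-Key + GUID ) )
    static String acceptFor(String secWebSocketKey) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] hash = sha1.digest((secWebSocketKey + GUID).getBytes(StandardCharsets.US_ASCII));
            return Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-1 is available on every Java platform
        }
    }

    public static void main(String[] args) {
        // Sample key from RFC 6455, section 1.3.
        System.out.println(acceptFor("dGhlIHNhbXBsZSBub25jZQ=="));
    }
}
```

For the RFC's sample key this prints s3pPLMBiTxaQ9kYGzzhZRbK+xOo=, which is the value the RFC lists. A client that receives any other value is expected to treat the handshake as failed.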

I tried out some minimal code to understand the WebSocket protocol. Feel free to try it out and explore. Once you follow the instructions given in the readme file and run the application locally, open it in several windows or tabs to see that the messages you type in one of them automatically appear in all the others. Find a short video below for clarity.

Try CircleCI Orb

CircleCI is one of the popular Continuous Integration and Continuous Delivery platforms. With CircleCI you can create your pipeline through code. You write your CI/CD jobs, steps and required commands in YAML format, as shown in the sample configuration file below.

When the configuration file is placed in your code repository under the .circleci directory, CircleCI Cloud automatically picks it up, reads it and executes the jobs.

A CircleCI orb is simply a reusable package. In your CircleCI jobs you may need to run similar commands repeatedly; you can create an orb that does that common thing and make your CircleCI configuration more readable and maintainable.

How to create an orb?

The CircleCI team has made creating and sharing an orb very simple. You can use this template and follow the instructions given in its README file, and that's it! To summarize, these are the steps I followed to create a simple orb:

Sign up to CircleCI using GitHub
Install the CircleCI CLI
Create a CircleCI namespace for your organization (or user, in case you don't have an organization), if you have not already done so:

circleci namespace create <name> --org-id <your-organization-id>

Initialize the orb and choose to download the template straight away:

circleci orb init <your-orb-name>

Push your changes to the GitHub repository
Check whether the CircleCI pipelines are running successfully. (ssh or lint checks may fail; you need to fix them and commit the changes until all steps succeed.)
Create a release with the v0.0.1 tag on GitHub
If everything is green on CircleCI, visit the orb registry to see your first orb!

https://circleci.com/developer/orbs/orb/<your-namespace>/<your-orb>

For developing and testing the orb, you can create a separate branch which will publish a development version. You can finally merge this development branch into master when you have something to release to the public registry.

notiforb

I have created a simple public orb called notiforb, a utility for sending notifications from your CircleCI pipeline to Google Chat (and more in the future). To send a message to Google Chat, add the following to your CircleCI configuration file.

1. Import Orb into your CircleCI config.

orbs:
  notiforb: akash-ccn/notiforb@0.0.1

2. Use the command in a job step.

  - notiforb/gchat-notify:
      message: "Hi, this is CircleCI notifORB!"

Several things happen behind the scenes in the gchat-notify command, and the orb takes care of them. Now let's see what we have in the orb code.

commands: Think of them as functions in programming, where you pass parameters and execute some instructions.

jobs: Sample jobs that describe the usage of the orb in the CircleCI registry.

examples : Show how to use your orb.

executors : The environment in which the commands of the orb are executed. Depending on your orb, you may or may not require executors.

@orb.yml : Contains the description and display information.

scripts: In CircleCI we mostly execute shell scripts on some virtual machine or container. This directory contains all the scripts that are used in notiforb.

templates : This directory is specific to notiforb and contains the Google Chat webhook request template used for sending messages.

The example orb here is very basic and was created for learning purposes only, but you can move any boilerplate code into an orb, whether it is a common shell script, executors, validations etc. CircleCI itself maintains many public orbs and provides regular updates.


Java EE to Jakarta EE

Jakarta EE is simply Java EE, taken over and rebranded by the Eclipse Foundation. At the time of writing this post, Jakarta EE 10 is the latest released version.

A few years back Oracle decided to stop managing Java EE and hand its ownership to an open source foundation. It was eventually given to the Eclipse Foundation; however, the Java trademarks, including the existing specification names, cannot be used by Jakarta EE specifications. Thus javax.* could no longer be used as the package namespace and had to be renamed, and jakarta.* was chosen (the rename took effect in Jakarta EE 9).

So ownership shifted from Oracle to the Eclipse Foundation. This shift raised backward compatibility concerns for existing libraries and frameworks which relied heavily on Java EE.

Basically, all users of Java EE had to switch to Jakarta EE in the near future for long-term support and updates, at the cost of backward compatibility. Jakarta EE didn't explicitly address this issue, but consumers of Java EE realized it and started moving to Jakarta EE gradually after the first release, Jakarta EE 8.

This has impacted frameworks, application servers and libraries altogether, including Spring, GlassFish, Tomcat, Jetty, JBoss, etc.

A few months ago Spring Boot 3 reached GA with a new baseline of Java 17 and Jakarta EE 9 (and support for Jakarta EE 10 as well). As you can see, not just Jakarta EE but also Java 17 is being pushed as an upgrade target, since it is an LTS release. If you are using an older, pre-Jakarta Spring version and are planning to upgrade to Spring Boot 3, you have to update any classes and interfaces that import javax.* to use jakarta.* imports instead.

IntelliJ IDEA has an option for doing this refactoring of imports: Refactor > Migrate Packages and Classes > Java EE to Jakarta EE.
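
To illustrate what such a migration does to a source file, here is a rough sketch that rewrites import lines. It is not what IntelliJ IDEA or any migration tool actually runs: the class name, the method name and the short package list are assumptions for illustration, and the list is deliberately not exhaustive. Note that packages such as javax.sql belong to the JDK itself, not to Java EE, and must stay as they are.

```java
import java.util.List;

class JakartaImportMigrator {

    // Illustrative subset of Java EE packages that moved to the jakarta namespace.
    static final List<String> MOVED_PACKAGES = List.of("servlet", "persistence", "validation");

    // Rewrites a single import line if it refers to one of the moved packages.
    static String migrateLine(String line) {
        if (!line.stripLeading().startsWith("import ")) {
            return line; // only import statements are touched in this sketch
        }
        for (String name : MOVED_PACKAGES) {
            String oldPrefix = "javax." + name + ".";
            if (line.contains(oldPrefix)) {
                return line.replace(oldPrefix, "jakarta." + name + ".");
            }
        }
        return line; // e.g. javax.sql.* is part of the JDK and is left alone
    }

    public static void main(String[] args) {
        List<String> source = List.of(
                "import javax.servlet.http.HttpServlet;",
                "import javax.persistence.Entity;",
                "import javax.sql.DataSource;");
        source.stream().map(JakartaImportMigrator::migrateLine).forEach(System.out::println);
    }
}
```

The first two imports come out with the jakarta prefix and the javax.sql import is printed unchanged. A real migration also has to cover configuration files, string references and third-party dependencies, which is why a dedicated tool is the better choice.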

The bottom line is that everything from Java EE which was under javax.* can now be found and referenced under jakarta.*. In the coming days we may see major changes in Jakarta EE which will make migration even more difficult if we don't switch to it in the near future.


Cloud and The Carbon

The transition to cloud computing is at its peak. Almost every kind of software we can think of is gradually moving to the cloud for its benefits compared to an on-premise setup, whether in security, maintenance, cost, availability or scalability. Cloud is now the preferred solution for most organizations. But cloud computing is not actually "cloud" computing; it is just powerful, connected data centers placed at different geographic locations, working together to provide us with a lot of cloud services.

Ever thought about the impact of these data centers on the environment? You may be surprised to know that the carbon footprint of cloud computing around the globe is reported to be higher than that of the airline industry. It is true that the seemingly unlimited capacity of the cloud comes with a negative impact on this extraordinary planet.

Let’s try to understand the overall picture.

Why is it impacting the environment?

Carbon emissions due to these data centers negatively impact the environment. An increase in CO2 (carbon dioxide) can raise the overall temperature of the planet, resulting in global warming and climate change, which has a significant impact on the health of all creatures that flourish on the planet, including humans.

Why do data centers create so much carbon?

These data centers contain a lot of servers, and computation produces heat, similar to your mobile phone or laptop but far more of it. To ensure smooth functioning, air cooling systems are installed in these data centers. These air cooling systems, which run 24x7 to ensure ~100% availability, are responsible for much of this high carbon footprint.

Such a powerful data center can consume as much electricity as thousands of homes, and about 40% of that electricity is used by the air cooling system. It is worth noting that these data centers don't directly produce CO2; it is the production of electricity from hydrocarbon fuels that produces the CO2.

What are the solutions to this problem?

We rely so heavily on the cloud nowadays that usage and consumption are surely going to increase in the coming years. Many cloud providers have already started taking steps towards a sustainable environment by pledging to become carbon neutral. However, a stringent and transparent agency is required to review the cloud providers and recommend solutions to them.

Interestingly, there is an example of a sustainable data center: EcoDataCenter in Sweden, which uses a free cooling method.


Java 17 PRNG API

Java 17 comes with a new API for pseudo-random number generators (PRNGs). There are several reasons behind introducing it, and it provides a uniform way to use existing and new PRNG algorithms.

The new interface was introduced for the following reasons:

ThreadLocalRandom extends the Random class and overrides a very similar set of methods, which can be avoided by using an interface.
To replace SplittableRandom PRNG objects with Random we have to change every variable, even though the classes have some similar behaviors.
Code duplication in Random, SplittableRandom and ThreadLocalRandom.
Make it easy to introduce new PRNG implementations without many code changes.
Provide support for PRNGs which are jumpable or leapable (e.g. Xoroshiro128+).

What does the new random generator API provide?

Following diagram can explain hierarchy of random generators,

As we can see, RandomGenerator is the parent interface. It is extended by the StreamableGenerator interface, which provides methods that return streams of generator objects, ideally statistically independent of one another.

Two child interfaces, SplittableGenerator and JumpableGenerator, extend the StreamableGenerator interface; both are introduced as part of Java 17.

Also, the old Random class now implements RandomGenerator, which brings the older random generation under the same umbrella and preserves it. As a result, its child class SecureRandom automatically supports RandomGenerator. Thus we don’t need to reimplement the SecureRandom class or its associated implementations.
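
As a small sketch of what this uniformity buys us, the method below is written once against RandomGenerator and accepts the old classes and a new algorithm alike. The class name and the rollDie helper are hypothetical; the generator classes and RandomGenerator.of are part of the Java 17 API.

```java
import java.security.SecureRandom;
import java.util.Random;
import java.util.SplittableRandom;
import java.util.concurrent.ThreadLocalRandom;
import java.util.random.RandomGenerator;

class UniformRandomDemo {

    // Works with any generator, old or new, through the common interface.
    static int rollDie(RandomGenerator generator) {
        return generator.nextInt(1, 7); // origin inclusive, bound exclusive
    }

    public static void main(String[] args) {
        RandomGenerator[] generators = {
            new Random(),
            new SecureRandom(),
            new SplittableRandom(),
            ThreadLocalRandom.current(),
            RandomGenerator.of("L64X128MixRandom")
        };
        for (RandomGenerator generator : generators) {
            System.out.println(generator.getClass().getSimpleName() + " -> " + rollDie(generator));
        }
    }
}
```

Switching the algorithm now means changing only the line that creates the generator, not every variable and method signature that uses it.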

`SplittableGenerator`

Allows creating a new SplittableGenerator from an existing one, which will generate statistically independent results. After a split, there is a high probability that the two objects together generate values with the same statistical properties as if they were generated by a single object. It provides two major methods: split(), which returns the split-off generator, and splits(), which returns an effectively unlimited stream of generators split off from the existing one.

It is worth noting that instances of SplittableRandom (which is an implementation of SplittableGenerator) are not thread-safe.
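
A minimal sketch of both methods, assuming SplittableRandom as the implementation. The class name, the helper method and the seed are made up for illustration. Since the instances are not thread-safe, the usual pattern is to hand each worker its own split-off generator instead of sharing one.

```java
import java.util.Arrays;
import java.util.SplittableRandom;
import java.util.random.RandomGenerator.SplittableGenerator;

class SplitDemo {

    // Creates one child generator per worker and draws a single value from each.
    static int[] drawFromEachSplit(SplittableGenerator parent, int workers) {
        return parent.splits(workers)            // stream of 'workers' new generators
                     .mapToInt(child -> child.nextInt(100))
                     .toArray();
    }

    public static void main(String[] args) {
        SplittableGenerator parent = new SplittableRandom(42L);

        // split() returns a new generator; the parent stays usable.
        SplittableGenerator child = parent.split();
        System.out.println("parent: " + parent.nextInt(100) + ", child: " + child.nextInt(100));

        System.out.println(Arrays.toString(drawFromEachSplit(parent, 4)));
    }
}
```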

Java 17 also has implementations of the LXM family of PRNG algorithms, which includes the following classes:

L32X64MixRandom
L32X64StarStarRandom
L64X128MixRandom
L64X128StarStarRandom
L64X256MixRandom
L64X1024MixRandom
L128X128MixRandom
L128X256MixRandom
L128X1024MixRandom

`JumpableGenerator`

Allows generating random values and jumping ahead to a distant point in the state cycle.

Provides the jump() method, which jumps forward by a fixed distance (typically 2⁶⁴). Also, the jumps() method returns an effectively unlimited stream of generators, each obtained after jumping forward. These are the major methods, but it provides others as well, such as jumpDistance(), which returns the distance by which the jump() method will jump forward, and copyAndJump(), which copies the existing generator, jumps the original forward and returns the copy. Unlike split(), which returns the split-off object, jump() returns void and just jumps.

The LeapableGenerator interface extends JumpableGenerator and allows jumping ahead by a much larger number of draws.
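
The methods described above can be tried out with a short sketch like the following. The class name and the copiesAgree helper are hypothetical; the algorithm name is one of the jumpable implementations shipped with Java 17.

```java
import java.util.random.RandomGenerator;
import java.util.random.RandomGenerator.JumpableGenerator;

class JumpDemo {

    // Two generators with identical state produce identical values.
    static boolean copiesAgree(JumpableGenerator generator) {
        JumpableGenerator twin = generator.copy();
        return generator.nextLong() == twin.nextLong();
    }

    public static void main(String[] args) {
        JumpableGenerator generator = JumpableGenerator.of("Xoshiro256PlusPlus");
        System.out.println("jump distance: " + generator.jumpDistance());
        System.out.println("copies agree:  " + copiesAgree(generator));

        // copyAndJump() hands back a copy of the current state and moves
        // 'generator' itself forward by one jump distance.
        RandomGenerator forWorker = generator.copyAndJump();
        System.out.println("worker draws:  " + forWorker.nextInt(100));
        System.out.println("main draws:    " + generator.nextInt(100));

        generator.jump(); // returns nothing, only advances the state
    }
}
```

Because the copy and the jumped original are far apart in the state cycle, they can be used side by side (for example in two threads) without their sequences overlapping for a very long time.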

The following implementations of JumpableGenerator are provided in Java 17:

Xoshiro256PlusPlus
Xoroshiro128PlusPlus

Sample program to list the random generators implemented in Java 17. Output:

Provided RandomGenerators :
L32X64MixRandom
L128X128MixRandom
L64X128MixRandom
SecureRandom
L128X1024MixRandom
L64X128StarStarRandom
Xoshiro256PlusPlus
L64X256MixRandom
Random
Xoroshiro128PlusPlus
L128X256MixRandom
SplittableRandom
L64X1024MixRandom
---------
Create RandomGenerator :Xoshiro256PlusPlus
-1083384208
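
The original program is not reproduced in this text, so the following is only a sketch of a program that could produce output in the shape shown above. The class and method names are assumptions; RandomGeneratorFactory.all() and RandomGenerator.of() are the Java 17 APIs for discovering and creating generators. The order of the names and the random number will differ from run to run.

```java
import java.util.List;
import java.util.random.RandomGenerator;
import java.util.random.RandomGeneratorFactory;

class ListRandomGenerators {

    // Names of all generator algorithms available in the running JDK.
    static List<String> providedGenerators() {
        return RandomGeneratorFactory.all()
                .map(RandomGeneratorFactory::name)
                .toList();
    }

    public static void main(String[] args) {
        System.out.println("Provided RandomGenerators :");
        providedGenerators().forEach(System.out::println);
        System.out.println("---------");

        String name = "Xoshiro256PlusPlus";
        RandomGenerator generator = RandomGenerator.of(name);
        System.out.println("Create RandomGenerator :" + name);
        System.out.println(generator.nextInt());
    }
}
```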

Reference :

JEP 356: Enhanced Pseudo-Random Number Generators

CircleCI Security Incident

Security incidents are nightmares for companies, as they significantly impact reputation and credibility. Recently, CircleCI faced a security incident and was forced to alert customers to rotate or revoke their secrets.

I wanted to understand the incident in detail and analyze the overall timeline to see why and how it happened.

Before I start, it’s worth appreciating the transparency shown by CircleCI: they not only alerted their customers and took the necessary steps to secure the environment, but also publicly shared every possible detail of the incident for their users.

Let’s try to understand the incident in detail now.

The attack took place through malware capable of performing actions that can steal, damage or destroy things in the CircleCI environment. The malware was deployed on a CircleCI engineer’s machine, and from this machine it may have been intended to impersonate the engineer and spread further, since the machine had access to production systems.

CircleCI reported that the malware was used to steal a valid 2FA-backed session through session cookie theft, in order to gain access to their production systems.

It’s surprising that this malware was not detected by CircleCI’s antivirus software. That’s unfortunate, but we first need to understand what exactly malware is in order to understand what sort of problems it can create.

What is Malware?

Malware is malicious software intended to destroy, steal, encrypt or alter data or components in a system. Types you might have heard of include trojan horses, ransomware and spyware; all of them are kinds of malware. Malware is continuously changed and improved by attackers for the destructive purposes mentioned above, which is why even antivirus software sometimes can’t detect it.

But how can it reach your system?

There are different ways; I’m listing a few of them:

Downloading software from untrusted websites
Downloading attachments from spam
Connecting an external drive which has malware
Joining an insecure network, etc.

Coming back to the CircleCI incident: with the team fully in damage control mode, their investigation found that the malware was capable of compromising their production environment, since the engineer on whose machine it was deployed had production access for their work.

The idea behind the intrusion may have been to impersonate the engineer (through cookie hijacking, stealing 2FA session data, etc.) and gain access to production systems.

But why did CircleCI ask customers to rotate their secrets?

This step was precautionary, to ensure a secure and clean system for both CircleCI and its customers.

As soon as the CircleCI team detected the unauthorized activity, it took action to stop the malware from gaining access and doing any sort of damage.

By rotating secrets immediately, customers can be assured that even if the attacker accessed any secret information, it cannot be used to spread further and do more damage in the system.

Following diagram can help to visualize the incident easily,

What steps did CircleCI take to avoid this in the future?

This is a very important question, since this incident is a lesson to learn, not just for CircleCI but for any other company.

From their report, I can summarize the following steps:

Restrict production access to a limited number of employees
Monitoring and alerting systems
Additional authentication on top of 2FA for production
Detect and block malware through antivirus software

Did it cause any damage?

None so far as per the report.

However, I can give a sample scenario.

Consider an application deployed on AWS infrastructure. For the sake of simplicity, you stored your AWS access secrets in CircleCI environment variables (I would recommend neither using access keys nor storing sensitive information in CircleCI environment variables).

Somehow the attacker got hold of this AWS access key and secret key, and can now access any of your AWS resources that the key has access to.

Assume we are not aware that the CircleCI environment variables are compromised and we keep the secrets in those environment variables. Unknowingly, our AWS infrastructure is at risk!

This is just an example to explain the possibility and severity of the threat, and it is very unlikely to happen. Real-world scenarios can be more complex and destructive when it comes to security attacks.

Having said that, the domain whose data is compromised makes a big difference here as well. Domains like health care cannot take a chance on even a minor security incident and need the highest level of assurance in terms of security.

The report also included a noteworthy line regarding security:

Security work is never done.


JVM Overview

What is JVM?

JVM stands for Java Virtual Machine. To understand the JVM, let’s first understand what exactly a VM (virtual machine) is.

A virtual machine, as the name suggests, is a machine which is virtual (non-physical). Basically, a VM is software which behaves like an actual physical machine. You can run one or more VMs on an actual physical machine, just like we run any application on our laptop.

Alright, now that we have a basic idea of what a VM is, let’s understand the JVM.

JVM is an abstract computing machine which…

provides a runtime environment for the execution of Java bytecode.
enables operating system independence of Java.
enables hardware independence of Java.
protects users from malicious programs, etc.

The JVM has different responsibilities and features, but in a nutshell it executes the Java bytecode instructions it is given. Actually, it does not even know anything about the Java language; it only knows about the class file format, which contains the instructions (or bytecode).

JVM Architecture

Class Loader Subsystems

To execute bytecode in the JVM, it first needs to be prepared for execution. The class loader subsystem takes care of that with the help of the following steps:

Loading : Loads the binary representation of a class or interface into the method area.
Linking : Includes verification, preparation and resolution.
Initializing : Executes the initialization method <clinit> of the class or interface.

Let’s understand each of them in more detail.

Loading

Loading reads the .class file, creates an implementation-specific internal representation of the class and stores it in the method area.

There are three categories of class loaders in the JVM. The description below applies to Java 8 and earlier; since Java 9, rt.jar and the ext directory are gone and the Extension Class Loader has been replaced by the Platform Class Loader.

Bootstrap Class Loader
- Loads the rt.jar file (contains the essential Java runtime libraries (classes)).
- Starting point of the class loading process
- Written in native code
Extension Class Loader : Loads JARs located inside the $JAVA_HOME/jre/lib/ext directory (extensions to the existing runtime libraries).
Application Class Loader : Loads files from the classpath.

With following program we will be able to understand a bit more,

Output

Class Loaders used for CheckClassLoader class
sun.misc.Launcher.AppClassLoader
sun.misc.Launcher.ExtClassLoader
--
Class Loaders used for String
null

As you can see, AppClassLoader is used for our CheckClassLoader class. Also, the parent of AppClassLoader is ExtClassLoader, which is simply the Extension Class Loader.

The String class, on the other hand, is loaded directly by the Bootstrap Class Loader, which is written in native code; thus there is no Java class representing it and the program prints null.
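
The program itself is not reproduced in this text, so here is a sketch of what such a check could look like. Only the class name CheckClassLoader comes from the output above; the loaderChain helper is an assumption. The output above is from Java 8; on Java 9 and later the same program prints different loader class names (an application class loader followed by a platform class loader), while String still yields null.

```java
import java.util.ArrayList;
import java.util.List;

class CheckClassLoader {

    // Collects the class names of the loader of 'type' and of all its parents.
    // A class loaded by the bootstrap loader has no ClassLoader object, so "null" is recorded.
    static List<String> loaderChain(Class<?> type) {
        List<String> names = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        if (loader == null) {
            names.add("null");
        }
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Class Loaders used for CheckClassLoader class");
        loaderChain(CheckClassLoader.class).forEach(System.out::println);
        System.out.println("--");
        System.out.println("Class Loaders used for String");
        loaderChain(String.class).forEach(System.out::println);
    }
}
```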

The overall process can be summarized with following diagram.

Linking

The linking step involves the following:

Verification of the structure of the binary representation. If the binary representation does not satisfy the expected static and structural constraints, a VerifyError is thrown.
- The constraints include:
  - Static constraints ensure the correct form of the binary representation.
  - Structural constraints ensure well-defined relationships between the instructions.
Preparation creates the static fields and initializes them with default values. This step does not require the execution of any JVM code; explicit initializers are executed later, during initialization.
Resolution of the symbolic references in the run-time constant pool.
- JVM instructions like getstatic, invokespecial, invokevirtual, instanceof, new etc. rely on symbolic references in the run-time constant pool.
- The run-time constant pool is used for multiple purposes, but in this context specifically it acts as a symbol table. Basically, it provides the actual entity associated with a given symbolic reference.

Consider the following HelloWorld program. If we disassemble HelloWorld with javap, we can see multiple code items listed, including instructions that use symbolic references (e.g. ldc, getstatic etc.). The resolution step resolves these symbolic references with the help of the run-time constant pool.
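
The HelloWorld program and its disassembly are not reproduced in this text, so the following is a stand-in sketch. The GREETING constant is an addition for illustration. The comment shows roughly what javap -c prints for main; the constant pool indices are written as #n because they vary between compiler versions.

```java
class HelloWorld {

    // Compile-time constant, so the compiler loads it with a plain ldc instruction.
    static final String GREETING = "Hello, World!";

    public static void main(String[] args) {
        System.out.println(GREETING);
    }
}

// Compile and disassemble with:
//   javac HelloWorld.java
//   javap -c HelloWorld
//
// The body of main looks roughly like this:
//   getstatic     #n  // Field java/lang/System.out:Ljava/io/PrintStream;
//   ldc           #n  // String Hello, World!
//   invokevirtual #n  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
//   return
```

Each #n is an index into the constant pool. Until resolution happens, an entry such as the one for System.out is only a symbolic name; resolution replaces it with a direct reference to the actual field.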

Initialization

This step is all about executing the initialization method of the class or interface.

However, a specific class or interface can only be initialized in one of the following cases:

Due to execution of one of the JVM instructions new, getstatic, putstatic or invokestatic which references the particular class.
Due to initialization of one of its subclasses.
If it is designated as the initial class or interface at JVM start-up.
If it is an interface that declares a non-abstract, non-static (default) method and a class that implements it directly or indirectly is initialized.
Through invocation of certain reflective methods.
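
The first two cases can be observed with a small sketch like the one below. All class names are made up for illustration. Reading the static field Child.counter compiles to a getstatic instruction, which triggers initialization of Child, and that in turn initializes its superclass Parent first.

```java
import java.util.ArrayList;
import java.util.List;

class InitializationDemo {

    // Records the order in which things happen.
    static final List<String> events = new ArrayList<>();

    static class Parent {
        static {
            events.add("Parent initialized");
        }
    }

    static class Child extends Parent {
        static int counter = 0; // not a compile-time constant, so access needs getstatic

        static {
            events.add("Child initialized");
        }
    }

    public static void main(String[] args) {
        events.add("main started");
        int value = Child.counter; // first use of Child triggers its initialization
        events.add("read Child.counter = " + value);
        events.forEach(System.out::println);
    }
}
```

Neither class is initialized when the program starts, only when Child is first used. The recorded order is: main started, Parent initialized, Child initialized, read Child.counter = 0.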

Runtime Data Area

The JVM holds multiple data (memory) areas which are utilised for different purposes. Two kinds of data areas exist:

JVM Level : Created on JVM start-up and destroyed on JVM exit.
Thread Level : Created for an individual Java thread and destroyed when the thread exits.

Following are the different data areas that exist in the JVM:

pc Register

pc stands for Program Counter. Each thread has its own pc register, which holds the address of the JVM instruction currently being executed by that thread. This context is what allows the JVM to support a multi-threaded environment.

pc registers are simply data which provide information about a thread’s state to the JVM.

JVM Stacks

A JVM stack is created for each thread and stores frames.

Each thread has some methods to execute, and for that execution we need to store the following data somewhere:

sequence of methods
local variables of particular method
partial results during method execution

JVM stacks can be of fixed size or dynamically expanding. Also, the memory allocated for a JVM stack need not be contiguous.

We can use the -Xss argument to set the size of the JVM stack.

java -Xss32m HelloWorld

Two well-known errors can occur due to JVM stacks:

StackOverflowError if the computation in a thread requires a larger JVM stack than is permitted.
OutOfMemoryError if the JVM stack can be dynamically expanded but expansion is not possible due to insufficient memory.
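
The first error is easy to provoke. The sketch below (class and method names are made up) recurses without a stopping condition, catches the StackOverflowError and reports how many frames fitted on the stack. The exact number depends on the JVM, the platform and the -Xss setting, so no particular value should be expected.

```java
class StackDepthDemo {

    private static int depth;

    // Each call pushes a new frame onto the current thread's JVM stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many nested calls were possible before the stack overflowed.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The stack has unwound at this point, so it is safe to continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before StackOverflowError: " + measureDepth());
    }
}
```

Running it once with the default settings and once with a larger stack, for example java -Xss32m StackDepthDemo, should show a noticeably higher count in the second run.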

Heap

The heap is created on JVM start-up and shared across all JVM threads. This is the place where memory for objects and arrays is allocated.

Heap storage for objects is reclaimed by the garbage collector (the automatic memory management system of Java). The heap can be of fixed size, or can be expanded or contracted based on requirements.

Memory allocated to the heap need not be contiguous.

We can configure the heap size with the following JVM arguments:

-Xms : initial size of the heap on start-up.
-Xmx : maximum allowed size of the heap.

In case memory is not available to allocate to the heap, the JVM throws OutOfMemoryError.
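
To see the effect of these arguments, a small sketch such as the following can be used. The class and method names are hypothetical; the Runtime methods are standard. The values reported by the JVM are approximate and may not match the configured sizes exactly.

```java
class HeapInfo {

    // Upper limit the heap may grow to, which is what -Xmx controls.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    static long toMegabytes(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime runtime = Runtime.getRuntime();
        System.out.println("max heap   (MB): " + toMegabytes(maxHeapBytes()));
        System.out.println("total heap (MB): " + toMegabytes(runtime.totalMemory())); // currently reserved
        System.out.println("free heap  (MB): " + toMegabytes(runtime.freeMemory()));  // unused part of total
    }
}
```

For example, java -Xms64m -Xmx256m HeapInfo should report a maximum close to 256 MB and a total that starts near 64 MB.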

Method Area

The method area is shared among all JVM threads and is created during JVM start-up.

It stores per-class structures such as the run-time constant pool, field and method data, and the code for methods and constructors.

Logically, the method area is part of the heap. Similar to the heap, the memory allocated to the method area need not be contiguous. Also, it can be of fixed size or expandable (and contractable as well).

In case memory is not available to allocate to the method area, the JVM throws OutOfMemoryError.

The overall runtime data area can be visualised with following diagram,

Execution Engine

This is the core of the JVM, which executes the Java bytecode. It converts the bytecode to machine code and executes it.

It has three major components:

Interpreter
- Reads bytecode and converts it to machine code.
- Interprets each instruction into machine instructions in isolation.
- One drawback is that it interprets every time, even when the same method is called multiple times, which decreases the performance of the system.
JIT (Just In Time) Compiler
- Compiles bytecode to machine code at runtime in an optimized way.
- Increases performance by overcoming the slow execution drawback of the interpreter.
Garbage Collector
- Performs automatic memory management for Java.

Note : For the sake of simplicity, things like native methods, the class file format, the workings of the garbage collector etc. are kept out of the scope of this article.

Reference :

https://docs.oracle.com/javase/specs/jvms/se17/html/index.html