Security incidents are nightmares for companies because they significantly damage reputation and credibility. Recently, CircleCI faced a security incident and was forced to alert customers to rotate or revoke their secrets.

I wanted to understand the incident in detail and analyze the timeline to see why and how it happened.

Before I start, it’s worth appreciating the transparency shown by CircleCI. They not only alerted their customers and took the necessary steps to secure the environment, but also publicly shared the details of the incident for their users.

Let’s try to understand the incident in detail now.

The attack was carried out through malware capable of stealing, damaging or destroying things in the CircleCI environment. The malware was deployed on a CircleCI engineer’s machine, and because that machine could reach production systems, it could be used to impersonate the engineer and spread further.

CircleCI reported that the malware was intended to steal a valid 2FA session through session cookie theft, in order to gain access to their production systems.

Surprisingly, the malware was not detected by CircleCI’s antivirus software. That’s unfortunate, but to understand what sort of problems malware can create, we first need to understand what it is.

What is Malware?

Malware is malicious software intended to destroy, steal, encrypt or alter data or components in a system. Trojan horses, ransomware and spyware, which you might have heard of, are all kinds of malware. Attackers continuously change and improve malware to serve these destructive purposes, which is why some antivirus software can’t detect it.

But how can it reach your system?

There are different ways. I’m listing a few of them:

Downloading software from untrusted websites
Downloading attachments from spam emails
Connecting an external drive that carries malware
Joining an insecure network, etc.

Coming back to the CircleCI incident: while the team was in damage control mode, their investigation identified that the malware was capable of compromising their production environment, because the engineer on whose machine it was deployed had production access as part of their work.

The idea behind the intrusion would be to impersonate the engineer (through cookie hijacking, stealing 2FA session data, etc.) and gain access to production systems.

But why did CircleCI ask customers to rotate their secrets?

This step was precautionary, to ensure a secure and clean system for both CircleCI and its customers.

As soon as the CircleCI team detected the unauthorized activity, it took action to stop the malware from accessing systems and doing any further damage.

Rotating secrets immediately gives assurance that even if the attacker accessed any secret information, they cannot use it to spread and cause further damage in the system.

Following diagram can help to visualize the incident easily,

What steps did CircleCI take to avoid this in the future?

This is a very important question, since this incident is a lesson not just for CircleCI but for any other company.

From their report, I can summarize the following steps:

Restricting production access to a limited number of employees
Monitoring and alerting systems
Additional authentication on top of 2FA for production
Detecting and blocking malware through antivirus software

Did it cause any damage?

None so far as per the report.

However, I can give a sample scenario.

Consider that we have an application deployed on AWS infrastructure. For the sake of simplicity, we stored our AWS access secrets in CircleCI environment variables (I would recommend neither using access keys nor storing sensitive information in CircleCI environment variables).

Somehow an attacker gets hold of the AWS access key and secret key, and can now access every AWS resource that the key has access to.

Assume we are not aware that the CircleCI environment variables are compromised and we keep the same secrets there. Without our knowing it, our AWS infrastructure is at risk!

This is just an example to explain the possibility and severity of the threat, and it is very unlikely to happen. Real-world security attacks can be more complex and destructive.

Having said that, the domain whose data is compromised also makes a big difference. Domains like health care cannot take a chance on even a minor security incident and need the highest level of assurance in terms of security.

The report also included a line about security that is worth remembering:

Security work is never done.

References

What is JVM?

JVM stands for Java Virtual Machine. To understand the JVM, let’s first understand what exactly a VM (Virtual Machine) is.

A virtual machine, as the name suggests, is a machine which is virtual (non-physical). Basically, a VM is software which behaves like an actual physical machine. You can run one or more VMs on an actual physical machine, just like we run any application on our laptop.

Alright, now that we have a basic idea of what a VM is, let’s understand the JVM.

The JVM is an abstract computing machine which…

provides a runtime environment for the execution of Java bytecode.
enables operating system independence of Java.
enables hardware independence of Java.
protects users from malicious programs, etc.

The JVM has different responsibilities and features, but in a nutshell it executes the provided Java bytecode instructions. Actually, it does not even know anything about the Java language; it just knows about the class file format, which contains the instructions (or bytecode).
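To make the "class file format" point concrete, here is a minimal sketch that reads the first bytes of a class file. Per the JVM specification, every class file starts with the magic number 0xCAFEBABE followed by the minor and major version. The class name ClassFileHeader and the helper readHeader are hypothetical names chosen for this illustration, and the sketch assumes the class file of java.lang.Object is reachable as a resource.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ClassFileHeader {

    // Reads the first 8 bytes of the class file backing the given class:
    // magic (4 bytes), minor_version (2 bytes), major_version (2 bytes).
    static int[] readHeader(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("Class file not found: " + resource);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();
            int minor = data.readUnsignedShort();
            int major = data.readUnsignedShort();
            return new int[] {magic, minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader(Object.class);
        // The magic number is the same for every valid class file.
        System.out.println("magic = 0x" + Integer.toHexString(header[0]).toUpperCase());
        // The major version depends on the JDK that compiled the class.
        System.out.println("major version = " + header[2]);
    }
}
```

The first line printed is magic = 0xCAFEBABE; the major version varies with the JDK in use.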

JVM Architecture

Class Loader Subsystems

Before bytecode can be executed in the JVM, it needs to be prepared for execution. The class loader subsystem takes care of that through the following steps:

Loading : Loads the binary representation of a class or interface into the method area.
Linking : Includes verification, preparation and resolution.
Initializing : Executes the initialization method <clinit> of the class or interface.

Let’s understand each of them in more detail.

Loading

Loading reads the .class file, creates an implementation-specific internal representation of the class and stores it in the method area.

There are three categories of class loaders in the JVM (the layout described here applies to Java 8 and earlier; newer versions replace rt.jar with modules and the extension class loader with a platform class loader):

Bootstrap Class Loader
- Loads the rt.jar file (contains the essential Java runtime libraries (classes)).
- Starting point of the class loading process.
- Written in native code.
Extension Class Loader : Loads JARs located inside the $JAVA_HOME/jre/lib/ext directory (extensions to the existing runtime libraries).
Application Class Loader : Loads files from the classpath.

The following program helps us understand this a bit more.
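The original listing is not reproduced in this text. A minimal sketch of a program that would print output of this shape, assuming it simply walks the loader chain with getClassLoader() and getParent(), might look like the following. The helper name describe is hypothetical, and the loader class names printed depend on the Java version.

```java
public class CheckClassLoader {

    // Returns the canonical class name of a loader,
    // or "null" when the class was loaded by the bootstrap class loader.
    static String describe(ClassLoader loader) {
        return loader == null ? "null" : loader.getClass().getCanonicalName();
    }

    public static void main(String[] args) {
        ClassLoader appLoader = CheckClassLoader.class.getClassLoader();

        System.out.println("Class Loaders used for CheckClassLoader class");
        System.out.println(describe(appLoader));              // loader of our own class
        System.out.println(describe(appLoader.getParent()));  // its parent loader
        System.out.println("--");
        System.out.println("Class Loaders used for String");
        // Core classes such as String report null as their loader.
        System.out.println(describe(String.class.getClassLoader()));
    }
}
```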

Output

Class Loaders used for CheckClassLoader class
sun.misc.Launcher.AppClassLoader
sun.misc.Launcher.ExtClassLoader
--
Class Loaders used for String
null

As you can see, AppClassLoader is used for our CheckClassLoader class. Also, the parent of AppClassLoader is ExtClassLoader, which is nothing but the Extension Class Loader.

The String class is loaded directly by the Bootstrap Class Loader, which is written in native code. There is no Java object representing it, so the program prints null.

The overall process can be summarized with following diagram.

Linking

The linking step involves the following:

Verification of the structure of the binary representation. If the binary representation does not satisfy the expected static and structural constraints, VerifyError is thrown.
- The constraints include:
  - Static constraints ensure the correct form of the binary representation.
  - Structural constraints ensure well-defined relationships between the instructions.
Preparation creates static fields and initializes them with default values. This step does not require execution of any JVM code; explicit initializers are executed later, as part of initialization.
Resolution of symbolic references in the run-time constant pool.
- JVM instructions like getstatic, invokespecial, invokevirtual, instanceof, new etc. rely on symbolic references in the run-time constant pool.
- The run-time constant pool is used for multiple purposes, but in this context specifically it contains a symbol table. Basically, it provides the actual entity associated with a given symbolic reference.

Consider a HelloWorld program. If we disassemble HelloWorld with javap, we can see multiple code items listed, including instructions that use symbolic references (e.g. ldc, getstatic etc.). The resolution step resolves these symbolic references with the help of the run-time constant pool.
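The original HelloWorld listing is not included in this text, so the following is a minimal stand-in rather than the author's exact program. The constant name GREETING is an assumption made for this sketch.

```java
public class HelloWorld {

    // A compile-time constant; javac stores the string in the constant pool.
    static final String GREETING = "Hello, World!";

    public static void main(String[] args) {
        // Compiles to getstatic (the field System.out), ldc (the string constant)
        // and invokevirtual (the method println), each of which refers to
        // an entry in the constant pool.
        System.out.println(GREETING);
    }
}
```

After compiling with javac HelloWorld.java, running javap -c -v HelloWorld lists the constant pool together with the disassembled instructions, so the entries each instruction refers to can be followed by hand.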

Initialization

This step is all about executing the initialization method of a class or interface.

However, a specific class or interface is initialized only in one of the following cases:

Execution of one of the JVM instructions new, getstatic, putstatic or invokestatic that references the particular class or interface.
Initialization of one of its subclasses, if it is a class.
Its designation as the initial class or interface at JVM start up.
Initialization of a class that implements it directly or indirectly, if it is an interface that declares a non-abstract, non-static method.
Invocation of certain reflective methods in the class library.
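Two of the triggers above can be observed from plain Java. The sketch below uses hypothetical classes Parent and Child whose static initializers write to a log: referring to Child.class alone does not initialize the class, while reading a non-constant static field (a getstatic) does, and the superclass is initialized first.

```java
import java.util.ArrayList;
import java.util.List;

public class InitOrderDemo {

    // Shared log that the static initializers below append to.
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static { LOG.add("Parent initialized"); }
    }

    static class Child extends Parent {
        static int counter = 1; // non-constant static field, read via getstatic
        static { LOG.add("Child initialized"); }
    }

    static List<String> run() {
        Class<?> type = Child.class;   // class literal: no initialization yet
        LOG.add("after class literal");
        int value = Child.counter;     // getstatic: initializes Parent, then Child
        LOG.add("after getstatic");
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On a fresh JVM this prints [after class literal, Parent initialized, Child initialized, after getstatic]. Initialization happens only once per class, so a second call to run() adds no further "initialized" entries.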

Runtime Data Area

The JVM holds multiple data (memory) areas which are utilised for different purposes. Two kinds of data areas exist:

JVM Level : Created on JVM start up and destroyed on JVM exit.
Thread Level : Created for an individual Java thread and destroyed when the thread exits.

Following are the different data areas that exist in the JVM:

pc Register

pc stands for Program Counter. Each thread has its own pc register, which holds the address of the JVM instruction currently being executed by that thread. This per-thread context is what lets the JVM support a multi-threaded environment.

pc registers are simply data which tell the JVM where each thread is in its execution.

JVM Stacks

A JVM stack is created for each thread, and it stores frames.

Each thread has methods to execute, and for that execution the following data needs to be stored somewhere:

the sequence of method invocations
local variables of a particular method
partial results during method execution

JVM stacks can be of fixed size or dynamically expanding. Also, the memory allocated for a JVM stack need not be contiguous.

We can use the -Xss argument to set the size of the JVM stack.

java -Xss32m HelloWorld

Two well-known errors can occur in connection with JVM stacks:

StackOverflowError if the computation in a thread requires a larger JVM stack than is permitted.
OutOfMemoryError if the JVM stack can be dynamically expanded but expansion is not possible due to insufficient memory.
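The first of these errors is easy to provoke. The following is a minimal sketch, with a hypothetical class name StackDepthDemo, that recurses without a base case until the thread's stack is exhausted and then reports how many frames were pushed.

```java
public class StackDepthDemo {

    private static int depth = 0;

    // Each call pushes a new frame onto the current thread's JVM stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the stack is exhausted and returns the depth reached.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The thread needed a larger stack than the JVM permits.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before StackOverflowError: " + measureDepth());
    }
}
```

The number printed depends on the JVM, the platform and the stack size, so running it with different -Xss values generally gives different depths.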

Heap

The heap is created on JVM start up and shared across all JVM threads. This is the place where memory for objects and arrays is allocated.

Heap storage for objects is reclaimed by the Garbage Collector (Java's automatic memory management system). The heap can be of fixed size or can expand or contract based on requirement.

Memory allocated to the heap need not be contiguous.

We can configure the heap size with the following JVM arguments:

-Xms : initial size of the heap on start up.
-Xmx : maximum allowed size of the heap.

If more heap memory is needed than can be made available, the JVM throws OutOfMemoryError.
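The effect of these settings can be inspected from inside a running program through java.lang.Runtime. This is a minimal sketch with a hypothetical class name HeapInfo; the values reported are approximate and roughly, not exactly, track the -Xms and -Xmx settings.

```java
public class HeapInfo {

    // Converts a byte count to whole mebibytes for readable output.
    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime runtime = Runtime.getRuntime();
        // Upper bound the JVM will attempt to use (influenced by -Xmx).
        System.out.println("max memory (MiB):   " + toMiB(runtime.maxMemory()));
        // Memory currently reserved by the JVM (starts near -Xms).
        System.out.println("total memory (MiB): " + toMiB(runtime.totalMemory()));
        // Part of the reserved memory that is currently unused.
        System.out.println("free memory (MiB):  " + toMiB(runtime.freeMemory()));
    }
}
```

Running it as java -Xms64m -Xmx256m HeapInfo and then with different values shows how the reported numbers move with the flags.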

Method Area

The method area is shared among the JVM threads and created during JVM start up.

It stores, on a per-class basis, the run-time constant pool, field and method data, the code for methods etc.

Logically, the method area is part of the heap. Similar to the heap, the memory allocated to the method area need not be contiguous. Also, it can be of fixed size or expandable (and contractable as well).

If memory in the method area cannot be made available to satisfy an allocation request, the JVM throws OutOfMemoryError.

The overall runtime data area can be visualised with following diagram,

Execution Engine

This is the core of the JVM, which executes the Java bytecode. It either interprets the bytecode or compiles it to machine code and executes that.

It has three major components:

Interpreter
- Reads bytecode instructions and executes them one at a time.
- Handles each instruction in isolation.
- One drawback is that it interprets the same code every time, even when the same method is called multiple times, which decreases the performance of the system.
JIT (Just In Time) Compiler
- Compiles bytecode to machine code at runtime in an optimized way.
- Increases performance by overcoming the slow execution drawback of the interpreter.
Garbage Collector
- Performs automatic memory management for Java.
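To get a feel for what interpreting one instruction at a time means, here is a toy stack-based interpreter. It is only an illustration of the idea: the opcodes PUSH, ADD and MUL are invented for this sketch and are not real JVM opcodes, and a real JVM interpreter is far more involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ToyInterpreter {

    // Hypothetical opcodes for this sketch only.
    static final int PUSH = 0;
    static final int ADD = 1;
    static final int MUL = 2;

    // Executes the program one instruction at a time and returns the top of the stack.
    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack
        int pc = 0;                                // program counter
        while (pc < code.length) {
            int opcode = code[pc++];
            switch (opcode) {
                case PUSH:
                    stack.push(code[pc++]);        // operand follows the opcode
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                default:
                    throw new IllegalArgumentException("Unknown opcode: " + opcode);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL};
        System.out.println(run(program)); // prints 20
    }
}
```

Every call to run decodes the same instructions again from scratch, which is the repeated work a JIT compiler avoids by compiling frequently executed code to machine code once.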

Note : For the sake of simplicity, things like native methods, class format structure, workings of garbage collector etc. are kept out of the scope of this article.

Reference :

https://docs.oracle.com/javase/specs/jvms/se17/html/index.html

Tuesday, January 24, 2023

CircleCI Security Incident

Friday, January 20, 2023

JVM Overview

Akash Thakare

Software Consultant

Sharing insights from my tech journey to inspire and learn together.