Archive for the ‘General’ Category

I found this very useful article this morning.
Couldn't resist sharing it with you all.

This infographic was created by Loggly.

HTTP Status Code Diagram


Solution for N + 1 problem in Hibernate

Posted: November 13, 2018 in General

Suppose we have a class Book with a many-to-one relationship with Author, as sketched below. If we load N books and then touch each book's author, Hibernate issues one query for the books plus one additional query per author: the classic N + 1 select problem.
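
A minimal sketch of the two entities, assuming standard JPA annotations (field names are illustrative, and getters/setters are omitted):

import javax.persistence.*;

// Author.java
@Entity
public class Author {
    @Id @GeneratedValue
    private Long id;
    private String name;
}

// Book.java
@Entity
public class Book {
    @Id @GeneratedValue
    private Long id;
    private String title;

    // Loaded lazily: touching getAuthor() on N separate books triggers N extra selects.
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "AUTHOR_ID")
    private Author author;
}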

We solve this problem by making sure that the initial query fetches all the data needed to load the objects we need in their appropriately initialized state.
One way of doing this is using an HQL fetch join.

We use the HQL query "from Book book join fetch book.author author", with the fetch keyword on the join.

This results in an inner join:

select BOOK.id, ..., AUTHOR.id, ...
from BOOK inner join AUTHOR on BOOK.AUTHOR_ID = AUTHOR.id
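
A minimal sketch of running this fetch join, assuming an open Hibernate Session named session (the getter names match the illustrative entity sketch above):

import java.util.List;
import org.hibernate.Session;

// Fetch the books together with their authors in a single select.
List<Book> books = session
        .createQuery("from Book book join fetch book.author author", Book.class)
        .list();

// Each Book already has its Author initialized, so this loop issues no further SQL.
for (Book book : books) {
    System.out.println(book.getTitle() + " by " + book.getAuthor().getName());
}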

Using a Criteria query we can get the same result from

Criteria criteria = session.createCriteria(Book.class);
criteria.setFetchMode("author", FetchMode.EAGER);

which creates the SQL:

select BOOK.id from BOOK
left outer join AUTHOR on BOOK.AUTHOR_ID=AUTHOR.id where 1=1;

In both cases, our query returns a list of Book objects with their Author initialized.
Only one query needs to be run to return all the book and author information required.

What is AWS S3?

S3 stands for Simple Storage Service.

Amazon S3 has a simple web services interface that you can use to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, fast, and inexpensive data storage infrastructure.

Unlike storage systems such as the Unix file system or HDFS (the Hadoop Distributed File System), which are based on folders and files, S3 is based on the concepts of a "key" and an "object". Amazon S3 stores data as objects within a bucket, which is a logical unit of storage. An object consists of a file and, optionally, any metadata that describes that file.

To store an object in Amazon S3, you upload the file you want to store to a bucket. When you upload a file, you can set permissions on the object as well as any metadata. Buckets are the containers: you control access per bucket, can view access logs for the bucket and its objects, and choose the geographical region where Amazon S3 will store the bucket and its contents. Customers are not charged for creating buckets, but are charged for storing objects in a bucket and for transferring objects in and out of buckets.

The Amazon S3 data model is a flat structure: there is no hierarchy of sub-buckets or sub-folders. You can, however, infer a logical hierarchy using key name prefixes and delimiters, and the Amazon S3 console supports a concept of folders, for example

documents/csv/datafeed.csv
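
A minimal sketch of listing keys under such a prefix, assuming the AWS SDK for Java v1 (bucket and prefix names are illustrative):

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ListObjectsV2Request;
import com.amazonaws.services.s3.model.ListObjectsV2Result;
import com.amazonaws.services.s3.model.S3ObjectSummary;

AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

// Treat "documents/csv/" as a logical folder by listing keys that share the prefix.
ListObjectsV2Request request = new ListObjectsV2Request()
        .withBucketName("my-example-bucket")
        .withPrefix("documents/csv/")
        .withDelimiter("/");

ListObjectsV2Result result = s3.listObjectsV2(request);
for (S3ObjectSummary summary : result.getObjectSummaries()) {
    System.out.println(summary.getKey());   // e.g. documents/csv/datafeed.csv
}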

Each Amazon S3 object has data (e.g. a file), a key, and metadata (e.g. object creation date, or a privacy classification such as protected, sensitive, or public). A key uniquely identifies the object in a bucket. Object metadata is a set of name-value pairs. You can set object metadata when you upload the object. Metadata cannot be modified after uploading, but you can make a copy of the object and set new metadata on the copy.
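
A minimal sketch of uploading an object with user metadata, again assuming the AWS SDK for Java v1 (bucket, key, file path, and metadata values are illustrative):

import java.io.File;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;

AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

// User metadata travels with the object as name-value pairs; it cannot be changed later.
ObjectMetadata metadata = new ObjectMetadata();
metadata.addUserMetadata("classification", "public");

PutObjectRequest request = new PutObjectRequest(
        "my-example-bucket",                 // bucket
        "documents/csv/datafeed.csv",        // key
        new File("/tmp/datafeed.csv"))       // data
        .withMetadata(metadata);

s3.putObject(request);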

Advantages of S3

  • Elasticity

If you were to use HDFS on Amazon EC2 (i.e. Elastic Compute Cloud) infrastructure and your storage requirements grew, you would need to add AWS EBS (i.e. Elastic Block Store) volumes and other resources to the EC2 infrastructure to scale up. You would also need to take additional steps for monitoring, backups and disaster recovery.

S3 decouples compute from storage. This decoupling allows you to easily (i.e. elastically) scale your storage up or down.

S3’s opt-in versioning feature automatically maintains backups of modified or deleted files, making it easy to recover from accidental data deletion.

  • Cost

S3 is 3 to 5 times cheaper than the AWS EBS (i.e. Elastic Block Store) storage used by HDFS.

  • Performance

S3 consumers don't have the data locally, so all reads need to transfer data across the network, and S3 performance tuning itself is a black box. Since HDFS data is local to the compute nodes, it is much faster (e.g. 3 to 5 times) than S3. S3 also has higher read/write latency than HDFS.

  • Availability, Durability & Security

Availability guarantees system uptime, and durability guarantees that data that has been written will survive permanently. S3 claims 99.999999999% durability and 99.99% availability, whereas HDFS on EBS offers around 99.9% availability.

S3’s cross-region replication feature can be used for disaster recovery & enhances its strong availability by withstanding the complete outage of an AWS region.

S3 has easy-to-configure audit logging and access control capabilities. These features, along with multiple types of encryption, make it easier to meet regulatory compliance needs such as PCI (i.e. Payment Card Industry) or HIPAA (i.e. Health Insurance Portability and Accountability Act) compliance.

  • Multipart Upload

You can now break your larger objects (e.g. > 100 MB) into chunks and upload a number of chunks in parallel. If the upload of a chunk fails, you can simply restart it.
You’ll be able to improve your overall upload speed by taking advantage of parallelism.

For example, you can break a 10 GB file into as many as 1024 separate parts and upload each one independently, as long as each part has a size of 5 MB or more.
If an upload of a part fails it can be restarted without affecting any of the other parts.
S3 will return an ETag in response to each part uploaded. Once you have uploaded all of the parts you can ask S3 to assemble the full object with another call to S3.
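
A minimal sketch of the low-level multipart flow, assuming the AWS SDK for Java v1 (bucket, key, file path, and part size are illustrative; the SDK's higher-level TransferManager can do the same work automatically):

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.*;

AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
String bucket = "my-example-bucket";
String key = "backups/big-file.bin";
File file = new File("/tmp/big-file.bin");

// 1. Start the multipart upload and remember its upload id.
InitiateMultipartUploadResult init =
        s3.initiateMultipartUpload(new InitiateMultipartUploadRequest(bucket, key));

// 2. Upload the file in 5 MB parts; each uploaded part returns an ETag.
long partSize = 5L * 1024 * 1024;
List<PartETag> partETags = new ArrayList<>();
long offset = 0;
for (int partNumber = 1; offset < file.length(); partNumber++) {
    long size = Math.min(partSize, file.length() - offset);
    UploadPartRequest part = new UploadPartRequest()
            .withBucketName(bucket)
            .withKey(key)
            .withUploadId(init.getUploadId())
            .withPartNumber(partNumber)
            .withFile(file)
            .withFileOffset(offset)
            .withPartSize(size);
    partETags.add(s3.uploadPart(part).getPartETag());
    offset += size;
}

// 3. Ask S3 to assemble the parts into the full object.
s3.completeMultipartUpload(
        new CompleteMultipartUploadRequest(bucket, key, init.getUploadId(), partETags));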

 

Characteristics of a microservices architecture (a minimal single-function service sketch follows the list):

  • Large monolith architectures are broken down into many small services.
    • Each service runs in its own process.
    • The applicable cloud rule is one service per container.
  • Services are optimized for a single function.
    • There is only one business function per service.
    • The Single Responsibility Principle: A microservice should have one, and only one, reason to change.
  • Communication is through REST API and message brokers.
    • Avoid tight coupling introduced by communication through a database.
  • Continuous integration and continuous deployment (CI/CD) is defined per service.
    • Services evolve at different rates.
    • You let the system evolve but set architectural principles to guide that evolution.
  • High availability (HA) and clustering decisions are defined per service.
    • One size or scaling policy is not appropriate for all.
    • Not all services need to scale; others require auto-scaling up to large numbers of instances.
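
As a rough illustration of "one business function per service", here is a minimal sketch of a single-purpose REST service, assuming Spring Boot (the class name, endpoint, and hard-coded value are illustrative):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

// The whole deployable unit does one thing: answer price lookups over REST.
@SpringBootApplication
@RestController
public class PriceService {

    public static void main(String[] args) {
        SpringApplication.run(PriceService.class, args);
    }

    @GetMapping("/prices/{productId}")
    public double price(@PathVariable String productId) {
        // In a real service this would use its own data store,
        // not reach into another service's database.
        return 9.99;
    }
}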

Lombok

Posted: June 9, 2018 in General, Java, Java8

Let's take a look at the following sample code.

import java.io.Serializable;
import java.util.Objects;

public class User implements Serializable {

    private long id;
    private String username;
    private String login;

    public long getId() {
        return id;
    }

    public void setId(long id) {
        this.id = id;
    }

    public String getUsername() {
        return username;
    }

    public void setUsername(String username) {
        this.username = username;
    }

    public String getLogin() {
        return login;
    }

    public void setLogin(String login) {
        this.login = login;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        User user = (User) o;
        return id == user.id &&
                Objects.equals(username, user.username) &&
                Objects.equals(login, user.login);
    }

    @Override
    public int hashCode() {

        return Objects.hash(id, username, login);
    }
}

A model class like this should have getters and setters for its instance variables, equals and hashCode implementations, an all-args constructor, and a toString implementation. This class so far has no business logic, and even without it, it is 50+ lines of code. This is insane.

Lombok is used to reduce boilerplate code for model/data objects, e.g., it can generate getters and setters for those objects automatically by using Lombok annotations. The easiest way is to use the @Data annotation.

import java.io.Serializable;
import lombok.Data;

@Data
public class User implements Serializable {

    private long id;
    private String username;
    private String login;
}

How to add Lombok to your Java project?

Using Gradle

dependencies {
    compileOnly('org.projectlombok:lombok:1.16.20')
    // Needed on Gradle 4.6+ so the Lombok annotation processor actually runs
    annotationProcessor('org.projectlombok:lombok:1.16.20')
}

Using Maven

<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.16.20</version>
</dependency>

Tips to remember while using Lombok

  1. Don't mix business logic with Lombok-annotated classes.
  2. Use @Data for your DAOs.
  3. Use @Value for immutable value objects.
  4. Use @Builder when you have an object with many fields of the same type (a short sketch of @Value and @Builder follows this list).
  5. Exclude generated classes from the Sonar report. If you are using Maven and Sonar, you can do this using the sonar.exclusions property.
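
A minimal sketch of tips 3 and 4, assuming Lombok's @Value and @Builder annotations (the class and field names are illustrative):

import lombok.Builder;
import lombok.Value;

// @Value makes the class final with private final fields, getters,
// equals/hashCode and toString; @Builder adds a fluent builder on top.
@Value
@Builder
public class Money {
    String currency;
    String amount;   // several fields of the same type, hence the builder
}

// Usage: the builder makes it obvious which String goes where.
Money price = Money.builder()
        .currency("USD")
        .amount("19.99")
        .build();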

Understanding the CAP theorem

Posted: June 3, 2018 in General

Finding the ideal database for your application is largely a choice between trade-offs. The CAP theorem is one concept that can help you understand the trade-offs between different databases. The CAP theorem was originally proposed by Eric Brewer in 2000. It was originally conceptualized around networked shared data and is often used to generalize the trade-offs between different databases. The CAP theorem centers around three desirable properties: consistency, where all users get the same data no matter where they read it from; availability, which ensures users can always read from and write to the database; and partition tolerance, which ensures that the database keeps working when divided across a network.

The theorem states that you can guarantee at most two of the three properties simultaneously. So you can have an available, partition-tolerant database; a consistent, partition-tolerant database; or a consistent, available database. One thing to note is that these properties are not necessarily exclusive of each other. You can have a consistent, partition-tolerant database that still emphasizes availability, but you're going to sacrifice part of either your consistency or your partition tolerance.

Relational databases trend towards consistency and availability. Partition tolerance is something that relational databases typically don’t handle very well. Often you have to write custom code to handle the partitioning of relational databases. NoSQL databases on the other hand trend towards partition-tolerance. They are designed with the idea in mind that you’re going to be adding more nodes to your database as it grows. CouchDB, which we looked at earlier in the course, is an available partition-tolerant database.

That means the data is always available to read from and write to, and that you're able to add partitions as your database grows. In some instances, the CAP theorem may not apply to your application. Depending on the size of your application, CAP trade-offs may be irrelevant. If you have a small or low-traffic website, partitions may be useless to you, and in some cases consistency trade-offs may not be noticeable. For instance, the votes on a comment may not show up right away for all users.

This is fine as long as all votes are displayed eventually. The CAP theorem can be used as a guide for categorizing the tradeoffs between different databases. Consistency, availability, and partition tolerance are all desirable properties in a database. While you may not be able to get all three in any single database system, you can use the CAP theorem to help you decide what to prioritize.

Spring Initializr

To build a Java application, the first step is to create a Java project. Most Java projects rely on third-party Java archive (JAR) dependencies, and these third-party archives usually have dependencies of their own. On top of that, each version of a dependency relies on particular versions of other dependencies. Managing all of these dependencies is a nightmare that Java developers have nicknamed JAR hell. To avoid JAR hell, we use build and dependency management tools like Maven or Gradle.

But even with Maven and Gradle, versioning between individual .jar files can be a nuisance. Spring Boot recognizes this and created the notion of a Spring Boot Starter, which bundles several dependencies into a grouping that is easier to manage. There are a lot, and I mean a lot, of Spring Boot Starter dependencies, so even cobbling together a project on your own can be difficult. This is where Spring Initializr comes to the rescue. Spring Initializr is a tool for creating Spring Boot Java projects by answering a series of questions and selecting check boxes to choose which features to include.

Initializr creates the package structure, the pom.xml for Maven, or build.gradle for Gradle files, and any required Java source classes.

Let's see how to use Spring Initializr.

Step 1: Go to https://start.spring.io/

Step 2: Choose a Java project with Maven and the latest Spring Boot version.

Screen Shot 2018-05-19 at 8.38.24 PM.png

Step 3: If you want to see more options, click on the 'switch to full version' link at the bottom of the page.

Screen Shot 2018-05-19 at 8.40.14 PM.png

Step 4: Choose Spring Starter packages.

Now we're going to scroll past the Generate Project button and look at all of the Spring Starter packages. From these we're going to choose Web and, within Web, Rest Repositories.

Screen Shot 2018-05-19 at 8.43.41 PM

Then keep scrolling until we get to the SQL section, where we're going to choose JPA and H2.

Screen Shot 2018-05-19 at 8.43.54 PM

Now we're going to go back and click the Generate Project button.

Spring Initializr will now generate a zip file. Copy it to your working folder, unzip it there, and start working on your project. 🙂
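
For reference, the generated project contains a main application class roughly like the sketch below, and with the Rest Repositories, JPA, and H2 starters selected you can expose an entity through a repository interface. The entity and repository here are illustrative additions, not something Initializr generates for you:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.rest.core.annotation.RepositoryRestResource;

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

// Generated by Initializr (the class name depends on the artifact name you chose).
@SpringBootApplication
public class DemoApplication {
    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}

// Added by hand: an entity stored in the in-memory H2 database...
@Entity
class Person {
    @Id
    @GeneratedValue
    private Long id;
    private String name;
    // getters and setters omitted for brevity
}

// ...and a repository that Spring Data REST exposes as a /persons endpoint.
@RepositoryRestResource
interface PersonRepository extends JpaRepository<Person, Long> {
}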