Friday, 31 July 2020

Do You Need To Submit Blogger Sitemap In Google Search Console

Google's crawler should be able to index your blog posts eventually, so you don't have to submit anything. But it is recommended to submit your Blogspot blog's sitemap to Google Search Console, as it might help your posts or pages get indexed by Google quicker.
For this, you don't need a custom sitemap for a Blogger blog; Blogger generates the sitemap automatically.

For posts:
http://blog-name.blogspot.com/sitemap.xml
http://custom-domain.com/sitemap.xml

For pages:
http://blog-name.blogspot.com/sitemap-pages.xml
http://custom-domain.com/sitemap-pages.xml

If you are using a custom domain name, use the custom-domain URLs mentioned above. You can check the URLs for your blog in the browser & submit them in Google Search Console. If your blog doesn't have any pages, don't submit the sitemap for pages.

Thursday, 30 July 2020

Use Notepad++ As Offline JSON Validator

You can find plenty of websites to validate & view JSON text. JSONLint is pretty popular, but I have had issues with it when the file size goes beyond 3–4 MB. I mostly use this website to validate & parse JSON; it works well with large files. So if you have a large JSON string to validate, it can be useful.
Now if you are on Windows & already using Notepad++, you can use it as an offline JSON validator. Just install this plugin, and you can validate & view JSON text in a tree view inside the editor. It works well with large text too.

Wednesday, 29 July 2020

Difference Between Wait AND Park Methods In Java Thread

If you are familiar with thread dumps, or have at least looked at one a couple of times, you have probably noticed the "WAITING" & "WAITING (parking)" states. So what's the difference?
In Java we have the Object.wait() and Unsafe.park() methods. Both of them suspend the running thread & put it in a waiting state, but the two methods work on different principles.
Object.wait() works on monitor-based synchronization, so it fits well with the "happens-before" relationship in Java. To bring a waiting thread back to the runnable state, we call Object.notify() on the same monitor object. When the thread comes back to the runnable state, it is guaranteed to see the updated values of the variables shared across threads. The JVM makes sure the thread's state is synchronized with main memory, but that is additional overhead.
Unsafe.park() takes the thread as an argument. To move a parked thread back to the runnable state, we call Unsafe.unpark() on the same thread. It works on a permit basis: when Unsafe.unpark() is called, it unblocks the thread if it is already parked, or makes sure the next park call on that thread returns immediately. Its performance should be better, as no synchronization with main memory is required. That's why thread pools in general (for example, ExecutorService implementations) use the park method while waiting for tasks from the blocking queue.
As you can see, the use cases are different. If you have state that is shared across threads & you want one thread to wait for another before proceeding with an update, go with the wait() and notify() methods. As an application developer, you will mostly never have to use park(); it is too low-level an API.
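To make the contrast concrete, here is a minimal sketch (class and method names are mine; Unsafe.park() is shown through its public wrapper, java.util.concurrent.locks.LockSupport, since application code should not call Unsafe directly):

```java
import java.util.concurrent.locks.LockSupport;

public class WaitVsParkDemo {
    private final Object lock = new Object();
    private boolean ready = false;
    private int sharedValue = 0;
    private volatile int observed = -1;

    // Monitor-based coordination: the waiter blocks in WAITING state
    // until another thread updates the shared state & calls notify().
    public int waitNotify() throws InterruptedException {
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                while (!ready) {             // guard against spurious wakeups
                    try {
                        lock.wait();         // releases the monitor; thread dump shows WAITING
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
                observed = sharedValue;      // happens-before guarantees we see the update
            }
        });
        waiter.start();
        synchronized (lock) {
            sharedValue = 42;
            ready = true;
            lock.notify();                   // moves the waiter back to RUNNABLE
        }
        waiter.join();
        return observed;
    }

    // Permit-based coordination: while parked, the thread dump shows
    // WAITING (parking); unpark() hands over the permit & releases it.
    public static boolean parkUnpark() throws InterruptedException {
        Thread parked = new Thread(LockSupport::park);
        parked.start();
        Thread.sleep(100);                   // let the thread reach the parked state
        LockSupport.unpark(parked);          // grant the permit
        parked.join(5000);
        return !parked.isAlive();            // true if unpark released it
    }
}
```

Note that park/unpark is race-free in one respect: even if unpark() ran before park(), the permit would already be available & park() would return immediately.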

Tuesday, 28 July 2020

Find Pattern And Its Anagrams In A Text

Suppose we need to find out whether some text contains a pattern or any of its anagrams. I will explain how you can do it in O(n) time.
Let's consider the example below:
Text: cedaabd
Pattern: abad

Here are all possible 4-letter anagrams of abad:
aabd aadb abad abda adab adba baad bada bdaa daab daba dbaa

So we have 12 patterns to match, which is not feasible. Instead, we will match the count of each unique character in the pattern against that of the current text window.
We will follow the steps below with the above example.

Pattern character count:
a b c d e
2 1 0 1 0

Why are we counting c & e even though these characters are not present in the pattern? We want a fixed set of characters to compare.
Otherwise, when we take a character from the text, we would first have to check whether that character is present in the pattern before incrementing its count. We want to avoid that extra comparison.
Suppose our text contains only English letters; then we know our table length will be 26. For our example, we have considered the 5 characters a, b, c, d, e.

Now let's count the first 4 characters of the text & then slide the window one character at a time.

t[1] to t[4]
a b c d e
1 0 1 1 1
no match with pattern count, increment t.

t[2] to t[5]
a b c d e
2 0 0 1 1
no match with pattern count, increment t.

t[3] to t[6]
a b c d e
2 1 0 1 0
count matched with pattern count, so we found a match, increment t to check next character.

t[4] to t[7]
a b c d e
2 1 0 1 0
count matched with pattern count, so we found another match, this is also end of text.

As you can see, we have found 2 matches in the text which are anagrams of the pattern. So what is the worst-case time complexity? If we recalculate the text character count from scratch in each step, that requires m operations.
So the time complexity would be O(mn). But we can optimize that with the sliding window technique. Once the count has been calculated for the first window, we can do the following:
Decrement the count of the character leaving the window.
Increment the count of the character entering the window.

For example, t[1] to t[4] count:
a b c d e
1 0 1 1 1

Now t[2] to t[5] can be counted as below:
decrement t[1] count from above. t[1] is c. So table would look like this:
a b c d e
1 0 0 1 1
Increment t[5] value. t[5] is a. So final table is
a b c d e
2 0 0 1 1

So with two operations we can recalculate the count table in each step & it doesn't depend on m. So the time complexity becomes O(n).

Another thing to consider is the time taken to compare the pattern table with the text table in each step. But as the table length is fixed (here it is 5) & doesn't depend on m or n, we can consider it constant.
So overall time complexity would be O(n).
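The steps above can be sketched in Java (a minimal version; the class and method names are mine, it assumes lowercase English letters, and it returns 1-based window start positions to match the walkthrough):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AnagramSearch {
    // Returns the 1-based start positions in text where the pattern or
    // one of its anagrams begins. Assumes lowercase English letters only.
    public static List<Integer> findAnagrams(String text, String pattern) {
        List<Integer> matches = new ArrayList<>();
        int n = text.length(), m = pattern.length();
        if (m == 0 || m > n) return matches;

        int[] patternCount = new int[26];   // fixed table, like a..e in the example
        int[] windowCount = new int[26];
        for (int i = 0; i < m; i++) {
            patternCount[pattern.charAt(i) - 'a']++;
            windowCount[text.charAt(i) - 'a']++;       // first window t[1..m]
        }
        if (Arrays.equals(patternCount, windowCount)) matches.add(1);

        // Slide the window: two count updates per step, independent of m.
        for (int i = m; i < n; i++) {
            windowCount[text.charAt(i) - 'a']++;       // character entering the window
            windowCount[text.charAt(i - m) - 'a']--;   // character leaving the window
            if (Arrays.equals(patternCount, windowCount)) {
                matches.add(i - m + 2);                // 1-based window start
            }
        }
        return matches;
    }
}
```

For the example text "cedaabd" and pattern "abad", this returns the two window starts 3 and 4, matching the trace above.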

Monday, 27 July 2020

How Asynchronous Ajax Calls Get Executed In JavaScript Event Loop

I will keep this post short. There are tons of resources where you can read about JavaScript and its event loop architecture. I am here to clarify a few things.
Let's take an example from this resource. It explains the event loop & the event queue used by the JavaScript engine fairly well. But the question is: what happens when an async call is made? We know it gets pushed to the event queue once the call is complete or a timeout happens. But it is unclear which component does it or how it actually happens. In the resource above, that component is called the "event table".
One basic thing first: a JavaScript engine is single-threaded & that thread runs the event loop. As an example, let's talk about the V8 JavaScript engine, which powers the Google Chrome browser & Node.js. Now, as mentioned in the resource, a setTimeout() function call goes to the event table. But what does that mean?
Here is the thing. Every JavaScript engine has a host, a browser in most cases. In the case of Node.js, it is the whole server runtime, which is more than just the V8 engine; Node.js has various C++ libraries to complement it. This host can have other threads that handle things like making an Ajax call or setting a timer. And these threads are also capable of pushing completed tasks to the event queue used in the JavaScript event loop.
For browsers, the setTimeout() call or the Ajax function call is part of the Web API. For Node.js, we have C++ libraries like libuv. They are outside of the JavaScript engine but can access the event queue. So even though the JavaScript engine is single-threaded, it can take advantage of its host's threads.

KMP Algorithm Explained In Plain English

Here I am just trying to explain the KMP algorithm in plain English. I will also explain the worst-case time complexity & why it is O(m + n).

We will take two examples, one where every character in the pattern repeats & another where no character in the pattern repeats.


Text:    aaaabaabab

Pattern: aaaaa


First, we need to build a proper prefix-suffix table. Let me explain the basic idea. We will take each proper prefix of the pattern.

Here these are "a", "aa", "aaa", "aaaa". We are skipping "aaaaa" as that is the whole pattern; if we get a match for that, we are done.

For each of these prefixes, we will check whether some run of characters appears both at the beginning & at the end.

So why do we need that? The idea is simple. Suppose we matched "aaaa" & then hit a mismatch. We know "aaa" is common at the beginning & at the end.

As we matched the last three "a"s, we know the starting three "a"s are already matched. So we can skip them when we start matching again & go directly to the 4th character. Let's see below.

Here is the proper prefix-suffix table for the example above. It's easy to understand: all characters are the same, so the proper prefix-suffix length is always the substring length - 1.

index   1 2 3 4

value   0 1 2 3

Now let's match it.


t[1] = p[1], match length 1, increment both t & p.

t[2] = p[2], match length 2, increment both t & p.

t[3] = p[3], match length 3, increment both t & p.

t[4] = p[4], match length 4, increment both t & p.

t[5] ≠ p[5], match failed, we won't move t in that case, prefix[4] = 3 from the table, so we set p to prefix[4] + 1 = 4.

t[5] ≠ p[4], match failed, we won't move t in that case, prefix[3] = 2 from the table, so we set p to prefix[3] + 1 = 3.

t[5] ≠ p[3], match failed, we won't move t in that case, prefix[2] = 1 from the table, so we set p to prefix[2] + 1 = 2.

t[5] ≠ p[2], match failed, we won't move t in that case, prefix[1] = 0 from the table, so we set p to prefix[1] + 1 = 1.

t[5] ≠ p[1], match failed, p is at the first character, so we can't decrement it & increment only t.

t[6] = p[1], match length 1, increment both t & p.

t[7] = p[2], match length 2, increment both t & p.

t[8] ≠ p[3], match failed, we won't move t in that case, prefix[2] = 1 from the table, so we set p to prefix[2] + 1 = 2.

t[8] ≠ p[2], match failed, we won't move t in that case, prefix[1] = 0 from the table, so we set p to prefix[1] + 1 = 1.

t[8] ≠ p[1], match failed, p is at the first character, so we can't decrement it & increment only t.

t[9] = p[1], match length 1, increment both t & p.

t[10] ≠ p[2], match failed, we won't move t in that case, prefix[1] = 0 from the table, so we set p to prefix[1] + 1 = 1.

t[10] ≠ p[1], match failed, p is at the first character, so we can't decrement it & increment only t, but t is at the last position, so matching ends here without finding the pattern.


So as you can see, t is never decremented. So if the text length is n, that part contributes n to the complexity.

But p is decremented. If the pattern length is m, is the worst-case time complexity O(mn)?

No. If you look at the steps above carefully, p was decremented 4 times consecutively only after it had been matched 4 times. Then p was decremented 2 times after it was matched 2 times. At last it was decremented 1 time after it was matched 1 time.

So basically, to have n backward steps you need n forward steps, & n forward steps means you are done traversing the whole text.

So the worst-case time complexity is n + n = 2n, which is O(n).


The time complexity of building the proper prefix-suffix table is O(m). I will probably write a separate blog post on that; here I want to focus on the actual search phase of KMP.

So the total time complexity of the KMP algorithm is O(m + n).


In the example above, we took a pattern where every character repeats. Now let's take one where all characters are different.


Text: aaaabaabab

Pattern: abcde

The prefix-suffix table is simple. There is no match among the characters in the pattern, so all values are 0.

index 1 2 3 4

value 0 0 0 0

Now let's match it.


t[1] = p[1], match length is 1, increment both t & p.

t[2] ≠ p[2], match failed, we won't increment t in that case, prefix[1] = 0 from the table, so we set p to prefix[1] + 1 = 1.

t[2] = p[1], match length 1, increment both t & p.

t[3] ≠ p[2], match failed, we won't increment t in that case, prefix[1] = 0 from the table, so we set p to prefix[1] + 1 = 1.

t[3] = p[1], match length 1, increment both t & p.

t[4] ≠ p[2], match failed, we won't increment t in that case, prefix[1] = 0 from the table, so we set p to prefix[1] + 1 = 1.

t[4] = p[1], match length 1, increment both t & p.

t[5] = p[2], match length 2, increment both t & p.

t[6] ≠ p[3], match failed, we won't increment t in that case, prefix[2] = 0 from the table, so we set p to prefix[2] + 1 = 1.

t[6] = p[1], match length 1, increment both t & p.

t[7] ≠ p[2], match failed, we won't increment t in that case, prefix[1] = 0 from the table, so we set p to prefix[1] + 1 = 1.

t[7] = p[1], match length 1, increment both t & p.

t[8] = p[2], match length 2, increment both t & p.

t[9] ≠ p[3], match failed, we won't increment t in that case, prefix[2] = 0 from the table, so we set p to prefix[2] + 1 = 1.

t[9] = p[1], match length 1, increment both t & p.

t[10] = p[2], match length 2, increment both t & p, but t is at the last position, so matching ends here without finding the pattern.


If you look carefully above, you can see the same thing: to decrement p, we need to have matched pattern characters at least the same number of times beforehand.

So the search time complexity is 2n, which translates into O(n).

If we take the prefix-suffix table construction time into account, the total time complexity is O(m + n).
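The matching procedure traced above can be sketched in Java (a minimal version; class and method names are mine, indices are 0-based internally, and the return value is 1-based to match the walkthroughs):

```java
public class KmpSearch {
    // Builds the proper prefix-suffix table; table[i] is the length of
    // the longest proper prefix of pattern[0..i] that is also a suffix.
    static int[] buildPrefixTable(String pattern) {
        int[] table = new int[pattern.length()];
        int len = 0;
        for (int i = 1; i < pattern.length(); i++) {
            while (len > 0 && pattern.charAt(i) != pattern.charAt(len)) {
                len = table[len - 1];   // fall back, like moving p in the walkthrough
            }
            if (pattern.charAt(i) == pattern.charAt(len)) len++;
            table[i] = len;
        }
        return table;
    }

    // Returns the 1-based position of the first occurrence of pattern
    // in text, or -1 if there is none.
    public static int indexOf(String text, String pattern) {
        if (pattern.isEmpty()) return 1;
        int[] table = buildPrefixTable(pattern);
        int p = 0;                                   // matched length so far
        for (int t = 0; t < text.length(); t++) {    // t never moves backwards
            while (p > 0 && text.charAt(t) != pattern.charAt(p)) {
                p = table[p - 1];                    // reuse the already-matched prefix
            }
            if (text.charAt(t) == pattern.charAt(p)) p++;
            if (p == pattern.length()) return t - p + 2;  // 1-based match start
        }
        return -1;
    }
}
```

Both examples from the post return -1 here, since neither "aaaaa" nor "abcde" occurs in "aaaabaabab".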


Sunday, 26 July 2020

Simple Explanation Of Rabin-Karp Pattern Matching Algorithm

In this post, I will try to explain the Rabin-Karp algorithm & its worst-case time complexity. The Rabin-Karp pattern matching algorithm is based on hash matching.
Let's take an example.
Text:    aadabc
Pattern: abc

We will use a simple hash function.
Let's create a table as below:
a b c d e
1 2 3 4 5

So how do we calculate the hash value? We will add up the values as per the table.
Pattern hash is a + b + c = 1 + 2 + 3 = 6

For comparing against the text, first we take 3 characters, as our pattern length is 3. Then we shift one letter in each step & recalculate the hash.
t[1] + t[2] + t[3] = 1 + 1 + 4 = 6
So hash matched. We need to compare each character from 1 to 3 position with the characters in the pattern.
t[1] = p[1], match length 1, increment both t & p.
t[2] ≠ p[2], matching failed, so we move on to match characters from t[2] to t[4].

t[2] + t[3] +  t[4] = 1 + 4 + 1 = 6
So hash matched. We need to compare each character from 2 to 4 position with the characters in the pattern.
t[2] = p[1], match length 1, increment both t & p.
t[3] ≠ p[2], matching failed, we need to match from t[3] to t[5].

t[3] + t[4] + t[5] = 4 + 1 + 2 = 7
Hash matching failed. We need to match from t[4] to t[6].

t[4] + t[5] + t[6] = 1 + 2 + 3 = 6
So hash matched. We need to compare each character from 4 to 6 position with the characters in the pattern.
t[4] = p[1], match length 1, increment both t & p.
t[5] = p[2], match length 2, increment both t & p.
t[6] = p[3], match length 3, so we found our match.

Now let's discuss a few drawbacks of the above approach. When we shift one position in the text, we are recalculating the entire hash.
If the pattern length is 20, we add up 20 values; if it is 10, we add up 10 values. So the cost is not constant & the time complexity of the hash function depends on the pattern length m.
So with a naive hash function, the time complexity is O(mn). That's why we need the concept of a rolling hash function, which uses the sliding window technique.
When we shift one position, we remove the first position's value from the previous hash & add the current position's value to it. So whatever the value of m is, we can recalculate the hash with only two operations.
That is constant & it brings the time complexity down to O(n).

Here is how we can do that.
t[1] + t[2] + t[3] = 1 + 1 + 4 = 6
To get hash for t[2] to t[4], we can do following.
6 - t[1] = 5
5 + t[4] = 6
So, t[2] + t[3] + t[4] = 6

There is one more thing. Whenever there is a hash match, we need to compare the pattern with that portion of the text & verify that the characters are the same. In the example above, we saw cases where the hash values were the same but the characters were not.
These false positives increase the time complexity. If you are using a bad hash function, you might get a false positive in every step, & you might need to compare half of the pattern characters on average before finding a mismatch.
So the time complexity in that case becomes (m / 2) * n, which is O(mn). That's why the Rabin-Karp algorithm has a worst-case time complexity of O(mn).
We can improve the hash function to reduce the time complexity to O(m + n) in most cases. I will try to cover that in a future post.
But I hope you now have a clear understanding of this string searching algorithm & why its worst-case time complexity is O(mn).
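The whole walkthrough, including the rolling hash, can be sketched in Java (a minimal version using the post's toy additive hash; class and method names are mine, it assumes lowercase English letters, and it returns a 1-based position):

```java
public class RabinKarp {
    // Toy hash from the post: each letter's value is its position in the
    // alphabet (a=1, b=2, ...) & the hash is the sum of the values.
    static int value(char c) { return c - 'a' + 1; }

    // Returns the 1-based position of the first match, or -1.
    public static int indexOf(String text, String pattern) {
        int n = text.length(), m = pattern.length();
        if (m == 0) return 1;
        if (m > n) return -1;

        int patternHash = 0, windowHash = 0;
        for (int i = 0; i < m; i++) {
            patternHash += value(pattern.charAt(i));
            windowHash += value(text.charAt(i));     // hash of t[1..m]
        }
        for (int start = 0; start + m <= n; start++) {
            // Compare characters only when the hashes agree, to rule out
            // false positives like "aad" vs "abc" (both hash to 6).
            if (windowHash == patternHash
                    && text.regionMatches(start, pattern, 0, m)) {
                return start + 1;
            }
            if (start + m < n) {
                // Rolling hash: drop the leaving char, add the entering one.
                windowHash -= value(text.charAt(start));
                windowHash += value(text.charAt(start + m));
            }
        }
        return -1;
    }
}
```

For the example text "aadabc" and pattern "abc", the hash matches at positions 1 and 2 are rejected by the character comparison, and the real match is found at position 4, exactly as in the trace above.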

How To Solve "Caused by: org.hibernate.HibernateException: Missing table" When Table Is Present In Database

If you are using JPA or Hibernate directly and got that exception while starting your application, there is one obvious reason for that. You...