04 Apr

Redis Sorted Object Set (Sorted Hashes)

We all know that the best way to store an object in Redis is through Redis hashes, and Redis sorted sets are used to keep a set of values ordered by a score given to each value. But what if we need a sorted set of objects? What we need is a “sorted set of objects”, which Redis does not support by default. We came up with a simple hybrid data structure that allows sorted objects by combining Redis hashes and Redis sorted sets.
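One natural way to realise such a structure (a hedged sketch, not necessarily the exact layout from the full post) is to keep each object in its own hash while a sorted set holds only the object ids with their scores: a range query on the sorted set returns the ids in order, and the hashes return the fields. The snippet below assumes the redis-py client (3.5+ for hset with mapping=); key names, fields and scores are illustrative.

  # Hybrid "sorted object set": one hash per object + one sorted set of ids.
  import time
  import redis

  r = redis.Redis(decode_responses=True)  # assumes a local Redis instance

  def save_user(user_id, fields, score):
      """Store the object as a hash and index its id in a sorted set."""
      r.hset(f"user:{user_id}", mapping=fields)          # object fields -> hash
      r.zadd("users:by_last_login", {user_id: score})    # id -> sorted set, with score

  def top_users(n):
      """Fetch the n highest-scored objects, preserving the sorted order."""
      ids = r.zrevrange("users:by_last_login", 0, n - 1)
      pipe = r.pipeline()                                 # one round trip for all hashes
      for uid in ids:
          pipe.hgetall(f"user:{uid}")
      return list(zip(ids, pipe.execute()))

  save_user(1, {"name": "amila", "country": "LK"}, score=time.time())
  save_user(2, {"name": "saman", "country": "LK"}, score=time.time() - 3600)
  print(top_users(2))

Pipelining the HGETALL calls keeps the read path to two round trips no matter how many objects the range query returns, which is what makes the hybrid cheap in practice.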

Read More

01 Jan

Sri Lanka Holidays Calendar 2016 for office365, Google etc. in ICS format

It's really important to track holidays when you plan work ahead, but most of the online digital calendars that we use these days don't have a built-in option to add the holiday calendar for Sri Lanka. Add the Sri Lanka Holidays Calendar 2016 to your digital calendar: download the ICS file SRI-LANKA-HOLIDAYS-2016.
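For the curious, the ICS file is nothing more than plain RFC 5545 (iCalendar) text with one all-day VEVENT per holiday. The sketch below builds such a file in Python; the two holidays shown are only examples, and the real SRI-LANKA-HOLIDAYS-2016 download carries the full 2016 list.

  # Minimal, hand-rolled iCalendar file with all-day holiday events.
  HOLIDAYS_2016 = [
      ("20160204", "Independence Day"),   # sample entries only
      ("20161225", "Christmas Day"),
  ]

  def build_ics(holidays):
      lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//example//sl-holidays//EN"]
      for date, name in holidays:
          lines += [
              "BEGIN:VEVENT",
              f"UID:{date}-{name.replace(' ', '-').lower()}@example.local",
              "DTSTAMP:20160101T000000Z",
              f"DTSTART;VALUE=DATE:{date}",   # date only = all-day event
              f"SUMMARY:{name}",
              "END:VEVENT",
          ]
      lines.append("END:VCALENDAR")
      return "\r\n".join(lines) + "\r\n"     # iCalendar requires CRLF line endings

  with open("sri-lanka-holidays-2016.ics", "w", newline="") as f:
      f.write(build_ics(HOLIDAYS_2016))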

Adding to office365

  1. Add calendar
  2. From file (don't go to the holiday calendar; Sri Lanka is not available there)
  3. Select the file to upload and the calendar the holidays will be added to (creating a separate new calendar is recommended; use the colour RED)
  4. Save

Read More

10 Oct

Romantic Opera (රෝමාන්තික ඔපෙරාව) By Kasun Kalhara (Lyrics)

I looked for these lyrics all over the internet and finally found them on a blog (http://dumbaat.blogspot.com/) and thought they deserved a re-blog. This is one of my favorite songs.

හේමන්ත නිද්‍රාවේ, ගිලෙන හැන්දෑවක
නිසල තාවයේ මතුවෙන්නේ
ඔබේ දෑස පත්ලේ ඇවිලෙන, ප්‍රේම අද්‍යාශ යයි

සිහිනේ උමතු නිම්නේ, මා හුදෙකලාවේ නොහෙලා
ආදරේ දෙවඟන හැම රාත්‍රියකම, හැම හිමිදිරියක, හැම දහවලකම
වෙලී ගයන්න එනු මැනා සොප්‍රනෝ හඬින්

රෝමාන්තික ඔපෙරාවෙහි සියුම් නාද සංරචනයේ,
හැම ස්වරයකින්ම වෑහෙන්නේ, ඔබේ මගේ නොනිම් ප්‍රේමයයි.

සඳ දිය දහරාවේ, පෙඟෙන ලැමේ දෙයද,
නග්න තලාවේ මතු වෙන්නේ
ඔබේ සියුමැලි සිරුරේ උමතු ආලිංගනය යි

සිහිනේ උමතු නිම්නේ, මා හුදෙකලාවේ නොහෙලා
ආදරේ දෙවඟන සදහට ම එකට,
එක ආදරයක එක මිහිරියාවක
ගිලී රඟන්න එනු මන දිනුම් මතින්

රෝමාන්තික ඔපෙරාවෙහි සියුම් රංග වින්‍යාසයේ,
හැම රැඟුමකින්ම මැවෙන්නේ, ඔබේ මගේ නොනිම් ප්‍රේමයයි.

රෝමාන්තික ඔපෙරාවෙහි සියුම් නාද සංරචනයේ,
හැම ස්වරයකින්ම වෑහෙන්නේ, ඔබේ මගේ නොනිම් ප්‍රේමයයි.

හැම රැඟුමකින්ම මැවෙන්නේ, ඔබේ මගේ නොනිම් ප්‍රේමයයි.

19 Jul

[Scrum] Adding new priorities to the current sprint

We see changes in priorities from time to time. Depending on the nature of the team, you might need to expect new priorities even in the middle of a sprint. New priorities might come as a production issue or a new feature development with higher ROI. I work in a service team where we get high-priority items all the time. In this article, I would like to talk about how we handle those issues.

What is the ideal way to handle additional requirements [the real Scrum way]

Once a sprint begins, you cannot add requirements or stories to the sprint backlog

It is as simple as that: you cannot add additional requirements to the sprint once it begins. In addition, keep in mind that if you break even one rule of Scrum, you cannot claim that you are following Scrum. Therefore, if you really need to stay in pure Scrum, you need to close the current sprint, re-prioritize the backlog and start a new sprint with the new priority list.

The ideal way doesn't seem to be efficient for us

As you might already know, there are many hybrid versions of Scrum because Scrum can be inefficient in some situations. In our case, some team members get a lot of ad-hoc tasks (mostly related to database administration), and being a service team, all members get some amount of ad-hoc work. We decided to fork original Scrum and make our own version of it, in search of better efficiency.

Read More

13 Jul

Setting up basic authentication for DBfit with Nginx

First you need to install Nginx on the server. I performed this on an Ubuntu server, so the following commands need to be altered according to your OS if you are not an Ubuntu user. This setup has two main parts: first we will install Nginx, and then we will move on to the basic-authentication setup for DBfit with Nginx.

Setting up Nginx (extremely straightforward)

This should install Nginx on the Ubuntu machine, but unluckily I got the following error [if it installed and started correctly without any errors, you can skip this section].

This happened due to the default site file in the sites-available folder. You can simply remove the default symbolic link in the sites-enabled folder and start Nginx again; this time it started without any issue.

Read More

31 May

Continuous data loading from MySQL to Hive [Using Tungsten]

Introduction [Skip If needed]

What is Hive

“The Apache Hive™ data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.”

~ Hive official site

Why Hive

Hive is used by a lot of major companies, including Facebook and Google. Hive excels at processing large amounts of data, and any use case that involves querying large amounts of data in near real time can benefit from Hive. Hive is strong in availability, scalability and manageability, and it now has the capability to store complicated schemas and supports advanced operations like table alteration. If you are stuck with a MySQL warehouse and want to get your data into Hive for some complex near-real-time querying, it will not be hard to replicate the same RDBMS schema in Hive. The following are some real-world use cases of Hive:

  • Bizo: We use Hive for reporting and ad hoc queries.
  • Chitika: for data mining and analysis
  • CNET: for data mining, log analysis and ad hoc queries
  • Digg: data mining, log analysis, R&D, reporting/analytics
  • Grooveshark: user analytics, dataset cleaning, machine learning R&D.
  • Hi5: analytics, machine learning, social graph analysis.
  • HubSpot: to serve near real-time web analytics.
  • Last.fm: for various ad hoc queries.
  • Trending Topics: for log data normalization and building sample data sets for trend detection R&D.
  • VideoEgg: analyze all the usage data

I hope you have enough motivation now, so let us get down to business.

Process


  1. Use Tungsten to extract the MySQL binlogs as CSV files
  2. Use the DDLScan tool to create the Hive staging and base table structures
  3. Use a map-reduce job to load the base tables from the staging tables (see the sketch below)
  4. [Optional] Use the bc tool to compare the MySQL tables and the Hive tables
  5. Set up continuous loading
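As a rough illustration of step 3, here is a hedged sketch that drives a staging-to-base load from Python via PyHive; Hive executes the INSERT OVERWRITE as a map-reduce job under the hood. The table and metadata column names (staging_users, base_users, seq_no, op_type) are placeholders, not the names Tungsten or DDLScan actually generate.

  # Materialise the base table: keep only the newest change per primary key
  # and drop rows whose last operation was a delete.
  from pyhive import hive

  conn = hive.Connection(host="hive-server", port=10000, username="etl")
  cur = conn.cursor()

  cur.execute("""
      INSERT OVERWRITE TABLE base_users
      SELECT id, name, email
      FROM (
          SELECT s.*, ROW_NUMBER() OVER (PARTITION BY id ORDER BY seq_no DESC) AS rn
          FROM staging_users s
      ) latest
      WHERE latest.rn = 1 AND latest.op_type != 'D'
  """)

  cur.execute("SELECT COUNT(*) FROM base_users")
  print(cur.fetchall())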

Read More

08 Apr

Reading and Understanding the Storm UI [Storm UI explained]

I tried to find a document that explains every parameter in the Storm UI, but I did not find any, so I thought of starting one of my own. I will start with whatever information I have by now and will try to make it as complete as possible. Thanks to all the forums and sites that helped me find this information. Note that this is a live document and you can suggest edits through comments, as it is neither perfect nor complete.

I will cover the 4 major views that you can find in the Storm UI and go through all the parameters in each view. This might lead to some repetition, but I ignored that to keep this as simple as possible. Before we start, you need to know a few basic terms. You might need to click and zoom the images, as they are unreadable at the default zoom level.

Read More

03 Apr

Split brain issue in Elasticsearch [2-node Cassandra for Titan graph database]

I recently configured a 2-node Cassandra cluster with Elasticsearch installed on both nodes, creating another 2-node cluster (the Cassandra cluster was built using DataStax OpsCenter). Soon after I configured my cluster, I got to know that there are quite a few issues when you try to get it done with only 2 nodes.

Split-brain problem in Elasticsearch

When there are 2 or more nodes acting as master nodes, we call it a split brain. This can happen when one node does not know that it is already connected to a cluster that has a master. When there is more than one master, indexing requests are handled by both nodes, and after some time the two copies of a shard diverge. We might have to do a full re-indexing to recover from this. If you need more information, you can read more about the split-brain problem here.
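The usual mitigation in the Elasticsearch versions of that era was to require a quorum of master-eligible nodes through discovery.zen.minimum_master_nodes, set to (N / 2) + 1. Below is a hedged sketch using the Python client of the time; the host names are placeholders, and the full post may describe a different fix.

  # Require a quorum of master-eligible nodes before a master can be elected.
  from elasticsearch import Elasticsearch

  master_eligible_nodes = 2
  quorum = master_eligible_nodes // 2 + 1   # = 2 for a 2-node cluster

  es = Elasticsearch(["http://node1:9200", "http://node2:9200"])
  es.cluster.put_settings(body={
      "persistent": {"discovery.zen.minimum_master_nodes": quorum}
  })

Note the trade-off: with 2 nodes the quorum is also 2, so the cluster stops electing a master whenever either node is down, which is exactly why a third (even tiny, dedicated master) node is usually recommended over a pure 2-node setup.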

Read More

03 Apr

Building a Cassandra cluster with DataStax OpsCenter [CentOS]

If you have tried creating a cluster through OpsCenter and failed, do not worry; it seems like no one gets it right on the first try. When we went through the installation process, we observed that there are issues when installing the DataStax agents through OpsCenter, and we faced a lot of problems while installing them. After doing it several times I settled on the easiest working way to install the agents and OpsCenter. If you are facing similar issues, this article might give you some other options to try out.

 Issues we faced

  • OpsCenter cannot create the DataStax agents on the cluster nodes [Cluster provisioning failed: Exception: Installation stage failed]
  • OpsCenter cannot connect to the agents installed on other nodes [There was a problem verifying an ssh login]
  • OpsCenter cannot connect to its own node via the loopback address
  • OpsCenter cannot reconnect to the nodes once the cluster is restarted
  • OpsCenter cannot connect to nodes with public IPs
  • The DataStax agent is not running on the node
  • Keep getting a message that an old node, which has been removed, is still running [OpsCenter agent is still running on an old node that was decommissioned or is part of a cluster that OpsCenter is no longer monitoring].

Read More

24 Jan

Moving from md5 towards a more secure hashing technique

If you have an old website or web application with some user base, you have probably hashed user passwords with md5, sha1, sha256, etc. Knowing that these old hashing techniques have been compromised by brute-force attacks, collision attacks and the like, you are probably searching for a way to move to something more secure such as bcrypt (crypt_blowfish). This article talks about how to do that.
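As a taste of what such a migration can look like, here is a hedged sketch of one common pattern, upgrade-on-login, using Python's bcrypt package; it is not necessarily the exact technique the article describes.

  # Verify against the legacy md5 hash, then transparently re-hash with bcrypt.
  import hashlib
  import hmac
  import bcrypt

  def verify_and_upgrade(password, stored_hash):
      """Return (is_valid, possibly_upgraded_hash) for one login attempt."""
      if stored_hash.startswith("$2"):                      # already bcrypt
          ok = bcrypt.checkpw(password.encode(), stored_hash.encode())
          return ok, stored_hash

      legacy = hashlib.md5(password.encode()).hexdigest()   # legacy md5 scheme
      if not hmac.compare_digest(legacy, stored_hash):
          return False, stored_hash

      # Password is correct against the old hash: upgrade it in place.
      upgraded = bcrypt.hashpw(password.encode(), bcrypt.gensalt()).decode()
      return True, upgraded

  ok, new_hash = verify_and_upgrade("s3cret", hashlib.md5(b"s3cret").hexdigest())
  print(ok, new_hash.startswith("$2"))

If you do not want to wait for users to log in, an alternative is to re-wrap every stored digest as bcrypt(md5(password)) in one pass and verify against that double hash from then on.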

Read More