Monitoring for Ethereum Private Network

Aim: Set up a monitoring website for your local Ethereum chain using eth-netstats.

Prerequisites

  1. RPC endpoint of a running Ethereum blockchain.
  2. git
  3. node
  4. npm

First of all, you need to get an RPC endpoint running somewhere. Let’s suppose you have it running on port 8545 and can access it via localhost.
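
A quick way to check that the RPC endpoint is reachable is a plain JSON-RPC call with curl (assuming the defaults, localhost and port 8545); it should return the client version:

curl -X POST -H "Content-Type: application/json" \
  --data '{"jsonrpc":"2.0","method":"web3_clientVersion","params":[],"id":1}' \
  http://localhost:8545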

The second step is to set up a monitoring site similar to https://ethstats.net.

1. Setting up the monitoring site

This is the site where all the information is displayed, and your blockchain server must be able to reach the port on which this site listens for data. The default repository at cubedro is a bit outdated, so I am going to use my branch, which has some updates regarding displaying the geographical location of servers.

git clone https://github.com/AyushyaChitransh/eth-netstats.git
cd eth-netstats
git fetch -va
git checkout patch-1
npm install
sudo npm install -g grunt-cli
grunt all

These commands install everything that is required. Now we are ready to spin up our site. I generally run this inside a screen:

# create a new screen named ethstat
screen -S ethstat
PORT=3000 WS_SECRET="apple" npm start
# press Ctrl+A, then D, to detach from the screen
# to reattach to screen, use this command:
# screen -dr ethstat

Now we have an empty site; no data is being displayed yet. Let’s get to the data part.

2. Start the blockchain client

Start your blockchain client and take note of the RPC port and the network port on which peers communicate. By default, the RPC port is 8545 and the network port is 30303. If you have left these at their defaults, you can proceed to the next step.
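
If you are using geth, a private-network start with these defaults looks roughly like the following sketch (flag names vary between geth versions, and recent releases use --http and --http.port instead of --rpc and --rpcport; the data directory and network id here are only placeholders):

# start a private-network node exposing RPC on 8545 and peering on 30303
geth --datadir ./chaindata --networkid 1234 \
     --rpc --rpcaddr 127.0.0.1 --rpcport 8545 \
     --port 30303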

3. Sending data to the site

We are going to use https://github.com/cubedro/eth-net-intelligence-api.git to send data to the site that we spun up in Step 1. Let’s set it up:

git clone https://github.com/cubedro/eth-net-intelligence-api.git
cd eth-net-intelligence-api
npm install -g pm2

Now we need to edit the eth-net-intelligence-api/app.json file.

It looks like this:

[
  {
    "name"              : "node-app",
    "script"            : "app.js",
    "log_date_format"   : "YYYY-MM-DD HH:mm Z",
    "merge_logs"        : false,
    "watch"             : false,
    "max_restarts"      : 10,
    "exec_interpreter"  : "node",
    "exec_mode"         : "fork_mode",
    "env":
    {
      "NODE_ENV"        : "production",
      "RPC_HOST"        : "localhost",
      "RPC_PORT"        : "8545",
      "LISTENING_PORT"  : "30303",
      "INSTANCE_NAME"   : "",
      "CONTACT_DETAILS" : "",
      "WS_SERVER"       : "wss://rpc.ethstats.net",
      "WS_SECRET"       : "see http://forum.ethereum.org/discussion/2112/how-to-add-yourself-to-the-stats-dashboard-its-not-automatic",
      "VERBOSITY"       : 2
    }
  }
]

We need to update a few fields here:

  • RPC_PORT, LISTENING_PORT: If you are not using the defaults, set the actual ports here.
  • INSTANCE_NAME: The name that will be displayed for our blockchain server. For example, you can name it Personal Blockchain 1.
  • WS_SERVER: The address of the monitoring site we set up in Step 1. If it runs on the same system, you can write ws://localhost:3000 here; if you have set your monitoring site up anywhere else, provide that address instead, for example ws://<monitoring-server-ip>:3000.
  • WS_SECRET: The secret we set while starting our monitoring site in Step 1. Here we had set it to apple.

Once all the configuration is done in app.json, we need to start the process that will send the data to our monitoring site.

cd eth-net-intelligence-api
pm2 start app.json
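
Once pm2 has started the process, a few standard pm2 commands help verify that data is flowing (node-app is the process name from app.json):

# list the running processes and their status
pm2 list
# follow the logs of the reporting process
pm2 logs node-app
# restart it after changing app.json
pm2 restart node-app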

Some points to take a note of:

Q: Why did we perform Steps 2 and 3 on the same server?
A: This is to enable proper scaling in the future. We could have run Step 3 on the same server as the monitoring site of Step 1, but in that case, every time we started a new blockchain server we would have to modify the Step 3 configuration, and the load on the monitoring site would grow as more blockchain servers were added to the monitoring list. So it is better to keep Steps 2 and 3 together on the same server.

Q: Okay, but what if we wanted to keep our blockchain client free from any other stuff? Can we do that?
A: Totally. All we would need to do is set up Step 3 on a different server and, in the RPC_HOST config parameter, give the address of our blockchain server.

Feel free to ask any other question that comes to your mind. Hope this helps you.

Exploring Blockchain

In this post I will explore some technologies that I became familiar with over the past couple of months, including findings related to blockchain, Ethereum, and Linux server management.

Solidity

Solidity is the language used to write smart contracts for Ethereum. In the world of Ethereum, participants interact with each other either directly (to transfer ether) or through contracts. A contract is a set of rules that manages the ownership of entities; for example, it can accept ether and generate a new entity.

What is ICO

This newly generated entity also has an associated value, because it was distributed in exchange for ether. Such an entity is called a token or coin, and ICOs are done to accept ether from other users and give them some coins/tokens in return. If you plan to invest in an ICO, first ask whether the tokens you receive actually have value: will the company running the ICO provide you a service in exchange for the tokens, or accept those tokens back and give you ether in exchange?

Where can one start?

There are many ways to interact with Ethereum. You need software that can talk to the Ethereum blockchain, and in that software you need your account. Here are the choices:

While writing this, I came across a very good Reddit thread. It is the best place for someone to get started with blockchain and to get a grasp of what these things are.

Other Resources

I have come across many collections, but I don’t know if there is a centrally curated place for them. Here are some of those findings:

The last one is still being developed; as far as I remember, I came across it in a Gitter channel.

For in-depth knowledge of all the resources related to blockchain, they are collected in two GitHub repositories, Xel/Blockchain-stuff and Scanate/EthList.

Journey towards Blockchain world

It’s been a little more than a month since I began this journey, and I have discovered new domains in the tech world. In this post I will discuss things like blockchain. Originally I planned to also write about the Open Source movement and the other things I have learned this month, but the blockchain section grew long 😉

Blockchain

Most people associate blockchain with the economic sector, relating it to banks, stock markets, and crypto-currencies. But blockchain is a technology, not a product that can only be used in the economic sector; it is more than that! It is a world computer, not merely an ICO platform. If you have no idea what it is, that’s fine, we will explore it here.

In simpler terms, it is like recording the lifetime of something. Currently it is most popularly used to record the lifetime of currency, i.e., how ownership of currency is transferred. I would like to share a beautiful line from A blockchain explanation your parents could understand:

It’s just a way to verify the ownership of something digital, even if there are identical copies… right? If you think about that sentence for long enough, it will already begin to dawn on you how big of a deal that is.

After this, the author goes on to talk about the future of democracy, music, energy, and file storage. It is very interesting stuff 😉 make sure to give it a read. Blockchain can be thought of as an incorruptible distributed database.

How a developer started

This is the story of a young boy who graduated from college and joined a startup. It is not going to be a short story, so I will save it for another time. For now, I can tell how a developer should start.

History

Blockchain is a technology, and there are many chains running on it. The first to arrive in this world was the one that introduced blockchain: Bitcoin. Bitcoin was introduced to the world by a whitepaper. I think I heard that it happened after the downfall of centralized banks in Japan, but as I explored the mystery behind the identity of the author, I couldn’t find any post relating it to that event; still, the story leading to the creation of Bitcoin is filled with unsolved mysteries.

Technology Involved

I started with a Udemy course covering Ethereum, blockchain, Solidity, and Truffle inside-out. There I understood what blockchain is and got hands-on with developing for it. Below I describe a few of the technologies involved in working with blockchain.

  • Blockchain: This is the underlying technology, and this is probably the official site for it.
  • Ethereum: Just as Bitcoin runs on a blockchain, so does Ethereum. But while Bitcoin is just a currency, Ethereum can be understood as a platform for developing any type of application that requires a blockchain. While I have worked with Ethereum, I haven’t explored the alternatives to it. Here are the site and github/ethereum; the GitHub repo is the central place for all Ethereum-related stuff.
  • Solidity: This is the language used to write programs that can be deployed on the chain. It requires a different mindset than usual programming: things done here are irreversible in nature, and every operation has an associated cost.
  • Truffle: a development environment, testing framework, and asset pipeline for Ethereum, aiming to make life as an Ethereum developer easier (a quick taste of its workflow is sketched after this list).
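
For a feel of the Truffle workflow, a bare-bones session looks roughly like this (the project name is made up, and the exact commands available depend on your Truffle version):

# install the Truffle CLI globally
npm install -g truffle
# scaffold an empty project, then compile and test the contracts in it
mkdir my-first-dapp && cd my-first-dapp
truffle init
truffle compile
truffle test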

To date, I have mostly developed, or participated in the development of, ICO (Initial Coin Offering) platforms. I have used a few more tools, which will be covered in another post, as this one has already grown long enough 😉

GSoC 2017 Work Product “Reactome Container” @Open Genome Informatics

Overview

This document describes the work done under Google Summer of Code 2017 for the organization Open Genome Informatics. The project idea was to build a stand-alone Reactome server in a Docker image (Reactome).

Reactome is a free, open-source, curated and peer reviewed pathway database. The goal of this project was to produce a Docker image that contains everything that is needed for a user to run Reactome on their own workstation. This includes the web applications, databases, scripts, and other supporting infrastructure components that make up Reactome.

Source Code

Accomplished milestones

  • Prepared a Docker image based on Maven, which builds the applications required by the Reactome server.
  • Modified two Docker images, one for WordPress and one for Tomcat, and used them to prepare the Compose file for the Reactome server.
  • Created Solr, Neo4j, and MySQL containers to be included in the Compose file.
  • Automated the deployment process of the Reactome server in Docker.
  • Provided a user guide.

Journey

All the work that I have done has been documented in the blog. Here is a brief description of how I proceeded:

June, coding begins:

  • Found the appropriate base images for our containers.
  • Added the WordPress site to the container.
  • Added the Tomcat container, with logs stored at an accessible location.

July, major coding period

  • Automated the process of building and running all the Java applications in the correct sequence.
  • WordPress links now point to the correct Java application in Tomcat.
  • Added the restore process for the Neo4j database, Solr data, and MySQL data.
  • Added logging for all databases, Java applications, and the various build processes.
  • Automated the deployment process of the server inside Docker.

August, finalizing things

  • Implemented flags in the deploy script.
  • Users can now select which applications to build.
  • Decided the application build order.
  • Prepared a user guide.

Future Work

  • Reactome releases its database in versions; currently the code is set to run with the files from the latest release. Once the Reactome release process can provide access to old releases, it will be easy to modify this project to use a user-specified version of Reactome data.
  • Configuration of passwords is done manually as of now. In the future, users will be able to set the passwords by running a single script and will not need to update more than one file to change a password.

Impact of GSoC

  • Got an opportunity to make a contribution to Reactome.org.
  • Got introduced to many open source communities, such as the Maven community, the Docker community, and the Stack Overflow community.
  • During this period, I interacted with members of other open source projects and was introduced to a few technical groups on social media, which were very enlightening.
  • I gained a lot of knowledge about the technology I worked on, about community interaction, and about the importance of effective communication with other developers working on the same project.
  • I realized that a project is not just about the coding part; a lot happens behind the scenes to make a project successful.

DAY 77

We have successfully implemented flags in the deploy script, and our development process has become much smoother. We are now testing all the Java applications that were built a few weeks ago. We did not encounter any problems back then because our m2 cache already had some artifacts installed during testing; those builds succeeded locally but failed when run on a fresh system.

Now we are testing each application one by one, and we have learned the following (a rough command-line sketch of this build order follows the list):

  • CuratorTool is the first application to be built. It builds reactome.jar, and we now install this jar into our local Maven repository as soon as it is built.
  • PathwayExchange is built next; it requires the reactome.jar built by CuratorTool.
  • RESTfulAPI should be built after PathwayExchange has been installed in the local Maven repository. The last step of the PathwayExchange build installs it locally by executing mvn compile package install.
  • Pathway-Browser can now be installed. It has been cloned from Reactome-pwp.
  • Next we build Search-Core. Search-Core is required by data-content and content-service, and it has no dependencies of its own; any it does have were already installed in a previous build.
  • data-content can be installed after Search-Core is ready. In data-content's pom.xml we were specifying that Search-Core would be found at <reactome.search.core>1.0.1-SNAPSHOT</reactome.search.core>, but the default pom file installs Search-Core at <reactome.search.core>1.0.1</reactome.search.core>.
  • We cannot install content-service yet; it requires interactors.db, which will be provided by Interactors-Core. That is currently being tested, and I will update this order soon.
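
Put together, the order above corresponds roughly to a sequence like the following; the repository directory names and Maven goals here are illustrative rather than the project's exact ones:

# CuratorTool first: it produces reactome.jar, installed into the local Maven repository
(cd CuratorTool && mvn clean install)
# PathwayExchange needs reactome.jar; its own build also ends with a local install
(cd PathwayExchange && mvn clean install)
# RESTfulAPI can be built once PathwayExchange is in the local repository
(cd RESTfulAPI && mvn clean package)
# Search-Core before data-content and content-service, since both depend on it
(cd Search-Core && mvn clean install)
(cd data-content && mvn clean package)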

A week before submission begins

We are approaching the week of GSoC project submission. Here is what we are working on: we improved the deployment process, and users can now use flags in the deploy script to decide whether they want to download data files, update them, or build the webapps.
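
A minimal sketch of what this flag handling might look like in deploy.sh (only the -b flag is mentioned in this post; the other letters and the variable names are illustrative):

# parse -d (download), -u (update) and -b <mode> (build, e.g. "all" or "select")
while getopts "dub:" opt; do
  case $opt in
    d) DOWNLOAD_DATA=1 ;;
    u) UPDATE_DATA=1 ;;
    b) BUILD_MODE="$OPTARG" ;;
    *) echo "Usage: $0 [-d] [-u] [-b all|select]"; exit 1 ;;
  esac
done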

Implementing the -b select flag was more complicated than the others. The easiest way would have been to prompt the user and save the responses in the file that builds the webapps, but that did not seem good: once the user ran -b select, those values would be stored and reused the next time the build process ran. So now we store the options entered by the user in temporary variables, which are exported to the shell environment in which the script runs. The Docker build runs in the same session, so it picks up those variables and uses them in the environment of maven_build.sh, the file responsible for building the webapps. So the flow of the "which apps to build" variable is as follows:

deploy.sh -> shell -> build_webapps.env -> maven_build.sh
build_webapps.sh -> build_webapps.env (it tells which variables to use from shell)
maven_build.sh: Its environment is populated from the docker run statement.
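
In concrete terms, that hand-off might look like this sketch (the variable, file, and image names are illustrative; the env file names variables whose values Docker takes from the shell, as described above):

# deploy.sh: export the user's choices into the current shell
export BUILD_APPS="data-content content-service"

# build_webapps.env: list the variable names Docker should read from the shell
BUILD_APPS

# build_webapps.sh: docker run picks the values up via --env-file,
# so they end up in maven_build.sh's environment inside the container
docker run --env-file build_webapps.env reactome-builder ./maven_build.sh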

In the process, the deploy script has become crucial for the project, as it manages the following tasks:

  • Download and update data files and other large files.
  • Extract the compressed files. This works for old files, but for files that were updated it still needs to be improved.
  • Decide which applications to build and start the build process.
  • Overcome permission issues with the MySQL and Solr logs.
  • Start the main containers for the server.

In the coming days, we will work on letting users set passwords and user names for the containers before using them; MySQL, Tomcat, Neo4j, Solr, almost all of them require this. Documentation also needs to be taken care of, and there are a few more issues to look into. It’s going to be a great week.

DAY 65

There were problems getting the MySQL logs into a host directory, and they were due to ownership issues. I also tried the long volume syntax, which requires version 3.2 of the docker-compose file format, thinking that the volume type might handle it correctly, but this time too the files were not created.

I think the issues I am facing are specific to my system. The main issue is the failure of the chown command: ownership remains unchanged after sudo chown in Ubuntu because of the NTFS file system. Not all users will be storing this project on an NTFS filesystem mounted inside their native filesystem.

So the problem with chown was specific to my system, but it is likely to happen on other systems with the same setup. It would therefore be better to store the logs using the other approach: creating a volume internal to Docker and then using chown to fix permissions and ownership.
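
That approach could look roughly like this (mysql_logs is an illustrative volume name; the compose file would mount it at /var/log/mysql instead of the bind-mounted host directory):

# create a Docker-managed volume instead of bind-mounting a host (here NTFS) path
docker volume create mysql_logs
# fix ownership from inside the container, where chown is not affected by the host filesystem
docker exec mysql-database chown -R mysql:mysql /var/log/mysql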

DAY 64

To enable MySQL to log to a host directory, the most promising option was to grant permissions to the container's mysql user using chown. As I investigated, I found that ls -l /var/log/mysql inside the container reported a userId and groupId of 999:999, while outside the container they were 1000:1000.
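
The mismatch is easy to see by comparing numeric IDs inside and outside the container (mysql-database is the container name and ./logs/mysql/wordpress the host log directory; both appear in the compose snippet in the DAY 63 entry below):

# inside the container: files belong to uid/gid 999, the container's mysql user
docker exec mysql-database ls -ln /var/log/mysql
# on the host: the mounted directory belongs to uid/gid 1000, the host user
ls -ln ./logs/mysql/wordpress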

An alternative is to use rsync to keep the MySQL log files in sync with files hosted in a different directory. By default, the mysql image does not have rsync (only sync and fsync are present), so this does not seem like a feasible option.

Earlier, when I was thinking about rsync, I believed it was some sort of process that keeps running in the background and syncs files the moment they change. Later I realized that this is done using cron, or Linux’s inotify system in combination with incron. But we don’t want this: it is expensive, and I would not prefer it, since there are other ways too.

DAY 63

I am trying to get the MySQL logs into a host directory. The problem is that MySQL does not log when its log directory is a mounted host directory. The reason is that mysql owns the log files inside the container, and when a host directory is mounted, that directory is not owned by the container's mysql user, so no logs are written.

One answer suggested using internal Docker volumes, which would be accessible to mysql; when I checked, the logs were indeed being written inside the internal volume. The problem with this approach is that internal Docker volumes are just that, internal: they live somewhere inside Docker's storage, and users may find it difficult to locate them.

Another approach, from the question Mount logs of mysql container in host directory, suggested modifying the entrypoint to grant permissions on the log files to the container's mysql user. But it did not seem to work. I am testing only one service, mysql-database, and its service description is as follows:

  mysql-database:
    image: mysql
    container_name: mysql-database
    volumes:
      - ./mysql/wordpress_data/:/docker-entrypoint-initdb.d
      - ./mysql/init.sh:/init.sh
      - ./mysql/test.cnf:/etc/mysql/mysql.conf.d/test.cnf
      - ./logs/mysql/wordpress:/var/log/mysql
    entrypoint:  /bin/sh -c "/init.sh && exec /entrypoint.sh mysqld"
    env_file:
      - .env

and the test.cnf file mentioned there is as follows:

[mysqld]
log-error        = /var/log/mysql/error.log
general_log      = 1
general_log_file = /var/log/mysql/log_output.log

init.sh is as follows:

touch /var/log/mysql.log
chown mysql:mysql /var/log/mysql.log
touch /var/log/mysql.error.log
chown mysql:mysql /var/log/mysql.error.log

The other files are the same as in the repository. For now, I will continue with this and see what the problem is.


I came across a useful section of a script that opens a browser on the host system. I wanted to implement this behavior in my project too, so here is the script:

case "$(uname)" in
"Darwin") open http://localhost:8080
          ;;
"Linux")  if [ -n "$BROWSER" ] ; then
                    $BROWSER http://localhost:8080
            elif    which xdg-open > /dev/null ; then
                    xdg-open http://localhost:8080
            elif    which gnome-open > /dev/null ; then
                    gnome-open http://localhost:8080
            #elif other types blah blah
            else
                    echo "Could not detect web browser to use - please launch URL using your chosen browser ie:  http://localhost:8080 or set your BROWSER variable to the browser launcher in your PATH"
            fi
          ;;
*)        echo "Browser not launched - this OS is currently not supported "
          ;;
esac

This is something we should implement in our project: users would just run one script, and when they come back to their system, their browser is ready with what they want.

DAY 62

I am trying to capture the log files for MySQL, and an answer was helpful in this matter. As I read that post, I realized that if we enabled all types of logs, they would grow huge and users might not know where to turn them off. So I think it is better to keep logging at the error level only, which is the default in MySQL.
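
To confirm which logs are actually enabled, the relevant server variables can be checked from inside the container (this assumes the root password configured through the .env file used by the compose setup):

# show the state of the general query log and the error log destination
docker exec -it mysql-database mysql -uroot -p \
  -e "SHOW VARIABLES LIKE 'general_log%'; SHOW VARIABLES LIKE 'log_error';"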

While capturing the MySQL logs, the logs do show up inside the container, but when the folder is bind-mounted from the host, no logs are created. I confirmed this behavior by testing it in the previous container and also in a newly built one; in both cases the logs were missing when a host directory was mounted. The same problem is addressed in this question, and its solution was to use a named volume.

When I used a volume named mysql-for-tomcat_logs, I saw that the logs were being written to /var/lib/docker/volumes/container_mysql-for-tomcat_logs/_data. Could I use symlinks to provide all logs in one place? I don’t think that is a good idea: a normal user won’t have permission to view the contents of /var/lib/docker/volumes/, so even if links are provided, they will not be of any use.
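
For reference, the place Docker stores a named volume can be found with docker volume inspect, though reading it still requires root (the volume name is the one mentioned above):

# print the host path backing the named volume
docker volume inspect --format '{{ .Mountpoint }}' container_mysql-for-tomcat_logs
# listing its contents needs root, which is exactly the usability problem described above
sudo ls /var/lib/docker/volumes/container_mysql-for-tomcat_logs/_data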

I think we should instead change the permissions of logs/mysql-for-tomcat while creating the log directories.


 

Other things to be taken care of:

  • Webapps build order: Some applications depend on others, so we need to build the independent webapps first. While deploying the project on a new system, there were problems complaining about missing dependencies. On my system there were no problems, since I was following the required sequence, but on a fresh system, installing content-service showed that it required some other things first. We will look into it.
  • Remove set -e from maven_builds.sh in the production version. With it, one failing build aborts the script, so the remaining apps never get built.
  • Re-unpacking: For solr_data and other tar/gz files, what if the user wants to delete them and unpack them again? I think we have placed the flag in the wrong location; it should be inside the directory that gets unpacked. For example, the solr_data_unpacked flag should live inside solr_data.

Today we were able to capture log files from most services; however, capturing the MySQL logs is yet to be done. I began testing data-content and was puzzled because data-content and content-service were serving the same application. After a while, I found there was a mistake in a mount point: I had mounted content-service in place of data-content.

After data-content was started, I could not see any working application, and there were no errors on the console. I changed the logging locations of content-service and data-content so that each logs to a different place; earlier both were using the same file for logging. The logs were stored in search.db. Tomorrow I’ll try reading this file to see what has gone wrong and try to bring the application up.