Monday, March 24, 2014

REST service in 5 minutes (using Java)

I remember the days when building something like this in Java took far more effort, and maybe that was why some start-ups chose weakly typed languages with all their fancy Web frameworks for rapid bootstrapping. This isn't the case anymore. See how easy it is to create a REST service that supports all CRUD operations in Java:

1. Define your task model:

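import java.util.Date;

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

/**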
 * Task model
 */
@Entity
public class Task {
  @Id
  @GeneratedValue(strategy = GenerationType.AUTO)
  private long id;
  private String text;
  private Date created = new Date();
  private Date completed;

  public String getText() {
    return text;
  }

  public void setText(String text) {
    this.text = text;
  }

  public Date getCreated() {
    return created;
  }

  public void setCreated(Date created) {
    this.created = created;
  }

  public Date getCompleted() {
    return completed;
  }

  public void setCompleted(Date completed) {
    this.completed = completed;
  }
}

2. Tell what operations on tasks you're going to support:

/**
 * This class defines DB operations on Task entity
 */
public interface TaskRepository extends PagingAndSortingRepository<Task, Long> {
  // Magic method name automatically generates needed query
  public List<Task> findByCompletedIsNull();
}

3. Configure your application:

/**
 * This class is responsible for:
 *  - Setting up DB connection and ORM
 *  - Initializing REST service for all found entities
 *  - Starting Spring application (main entry point)
 */
@ComponentScan
@Configuration
@EnableAutoConfiguration
@EnableJpaRepositories
@EnableTransactionManagement
public class Application extends RepositoryRestMvcConfiguration {

  @Bean
  public DataSource dataSource() throws PropertyVetoException {
    MySQLDataSource dataSource = new MySQLDataSource();
    dataSource.setDatabaseName("taskdb");
    dataSource.setUserName("user");
    dataSource.setPassword("pass");
    return dataSource;
  }

  @Bean
  public LocalContainerEntityManagerFactoryBean entityManagerFactory(DataSource dataSource) {
    HibernateJpaVendorAdapter jpaVendorAdapter = new HibernateJpaVendorAdapter();
    // Database tables will be created/updated automatically due to this:
    jpaVendorAdapter.setGenerateDdl(true);
    jpaVendorAdapter.setDatabase(Database.MYSQL);

    LocalContainerEntityManagerFactoryBean entityManagerFactoryBean = new LocalContainerEntityManagerFactoryBean();
    entityManagerFactoryBean.setDataSource(dataSource);
    entityManagerFactoryBean.setJpaVendorAdapter(jpaVendorAdapter);
    entityManagerFactoryBean.setPackagesToScan(getClass().getPackage().getName());
    return entityManagerFactoryBean;
  }

  @Bean
  public PlatformTransactionManager transactionManager() {
    return new JpaTransactionManager();
  }

  public static void main(String[] args) {
    SpringApplication.run(Application.class, args);
  }
}

That's all! After starting this application, you'll get a complete task REST service for free. Let's test it:

Create a new task:

~$ curl -X POST -H "Content-Type: application/json" -d '{"text":"Implement simplest REST Java application"}' http://localhost:8080/tasks

See the task contents:

~$ curl  http://localhost:8080/tasks/1
{
  "text" : "Implement simplest REST Java application",
  "created" : 1395665199000,
  "completed" : null,
  "_links" : {
    "self" : {
      "href" : "http://localhost:8080/tasks/1"
    }
  }
}

Create another task:

~$ curl -X POST -H "Content-Type: application/json" -d '{"text":"Go home"}' http://localhost:8080/tasks

Find all tasks:

~$ curl  http://localhost:8080/tasks
{
  "_links" : {
    "self" : {
      "href" : "http://localhost:8080/tasks{?page,size,sort}",
      "templated" : true
    },
    "search" : {
      "href" : "http://localhost:8080/tasks/search"
    }
  },
  "_embedded" : {
    "tasks" : [ {
      "text" : "Implement simplest REST Java application",
      "created" : 1395665199000,
      "completed" : null,
      "_links" : {
        "self" : {
          "href" : "http://localhost:8080/tasks/1"
        }
      }
    }, {
      "text" : "Go home",
      "created" : 1395665359000,
      "completed" : null,
      "_links" : {
        "self" : {
          "href" : "http://localhost:8080/tasks/2"
        }
      }
    } ]
  },
  "page" : {
    "size" : 20,
    "totalElements" : 2,
    "totalPages" : 1,
    "number" : 0
  }
}
(Notice how easy it is to implement pagination with this REST service!)
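
The templated link http://localhost:8080/tasks{?page,size,sort} in the response means that paging is driven by three optional query parameters. Here is a minimal Python sketch of building such URLs. The helper name page_url and the sort value "created,desc" are my own illustration and assumptions, not something taken from the service itself:

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080/tasks"  # endpoint used in the curl examples


def page_url(page=None, size=None, sort=None, base=BASE_URL):
    """Expand the {?page,size,sort} template, skipping parameters left unset."""
    candidates = (("page", page), ("size", size), ("sort", sort))
    params = [(name, value) for name, value in candidates if value is not None]
    return base + ("?" + urlencode(params) if params else "")


if __name__ == "__main__":
    print(page_url())                     # no parameters: server defaults apply
    print(page_url(page=1, size=1))       # second page, one task per page
    print(page_url(sort="created,desc"))  # assumed sort syntax, URL-encoded
```

Any of the printed URLs can be passed to curl in the same way as the plain /tasks URL.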

Mark the first task as complete:

~$ curl -X PATCH -H "Content-Type: application/json" -d "{\"completed\":$(($(date +%s)*1000))}" http://localhost:8080/tasks/1
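
The shell arithmetic $(($(date +%s)*1000)) converts the current Unix time in seconds to milliseconds, which is the format the "created" field uses in the responses. The same payload can be sketched in Python; the function name completed_payload is hypothetical:

```python
import json
import time


def completed_payload(now_seconds=None):
    """Return the JSON body that marks a task as completed at the given time."""
    if now_seconds is None:
        now_seconds = time.time()
    # Same arithmetic as $(($(date +%s)*1000)): whole seconds times 1000.
    return json.dumps({"completed": int(now_seconds) * 1000})


if __name__ == "__main__":
    print(completed_payload())
```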

Find incomplete tasks:

~$ curl  http://localhost:8080/tasks/search/findByCompletedIsNull
{
  "_embedded" : {
    "tasks" : [ {
      "text" : "Go home",
      "created" : 1395665359000,
      "completed" : null,
      "_links" : {
        "self" : {
          "href" : "http://localhost:8080/tasks/2"
        }
      }
    } ]
  }
}

Pretty easy and yet powerful, huh?

For more information, and for instructions on how to compile and run this application, see the source code on GitHub.

Tuesday, February 04, 2014

Parallelization monster framework for Pentaho Kettle

Our team always ends up rolling on the floor laughing when we try to name strange-looking ETL process diagrams. This monster has no name yet:


This is a parallelization framework for Pentaho Kettle 4.x. As you probably know, the upcoming version of Kettle (5.0) has a native ability to launch job entries in parallel, but we haven't got there yet.

In order to run a job in parallel, you have to call this abstract monster job and provide it with 3 parameters:

  • Path to your job (which is supposed to run in parallel).
  • Number of threads (concurrency level).
  • Optional flag that says whether to wait for completion of all jobs or not.

Regarding the number of threads: as you can see, the framework supports up to 8 threads, but it can easily be extended.

How does this stuff work? The "Thread #N" transformations are executed in parallel, each one receiving a copy of all rows. Each transformation then filters the rows according to the given number of threads, so that only the relevant portion of rows is passed on to its job (Job - Thread #N). For example, if the original row set was:

           ["Apple", "Banana", "Orange", "Lemon", "Cucumber"]

and the concurrency level was 2, then the first job (Job - Thread #1) will get ["Apple", "Banana", "Orange"], and the second job will get the rest: ["Lemon", "Cucumber"]. All the other jobs will get an empty row set.
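
The splitting in this example can be sketched in a few lines of Python. This only reproduces the fruit example (contiguous chunks, rounded up); the real Kettle transformations may filter rows differently, and the name split_rows and the max_threads parameter are my own assumptions:

```python
import math


def split_rows(rows, threads, max_threads=8):
    """Split rows into contiguous chunks, one per job slot.

    Slots beyond the requested number of threads get an empty row set.
    """
    if not 1 <= threads <= max_threads:
        raise ValueError("threads must be between 1 and max_threads")
    # Chunk size is rounded up, so the first jobs get the larger portions.
    chunk = math.ceil(len(rows) / threads)
    slots = []
    for n in range(max_threads):
        if n < threads:
            slots.append(rows[n * chunk:(n + 1) * chunk])
        else:
            slots.append([])
    return slots


if __name__ == "__main__":
    fruits = ["Apple", "Banana", "Orange", "Lemon", "Cucumber"]
    for number, portion in enumerate(split_rows(fruits, 2), start=1):
        print("Job - Thread #%d: %s" % (number, portion))
```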

Finally, there's a flag which tells whether we should wait until all jobs are completed.

I hope someone will find the attached transformations useful. If not, at least help me find a name for the ETL diagram. Fish, maybe? :)

Sunday, January 26, 2014

"Be careful when using or accessing WiFi connection"

Last week my Skype account was hacked during a weekend holiday in Budapest. I don't know how it happened. I only know that I was logged into Skype from my iPhone, and that I used a lot of the free public WiFi hotspots, which are abundant in Budapest. On the last day of my trip I tried to make a call from Skype, and the call ended too quickly, which should not have happened, since I remembered having a deposit of about 30 bucks on my account. I checked my account and found a lot of calls to Belarus, which I didn't make, of course:


There were dozens more entries like this.

The next thing I did was log out of the Skype iPhone app and change my password. Then I contacted Skype support and got a Web chat with a support engineer. I must say, their support reacted to my request immediately, which looked really professional. I chatted for about half an hour from my mobile phone's browser, and in the end I got a refund for all the calls I never made.

The incident is over now (actually, it was over within an hour of my realizing that my account had been hijacked), but it raises the question: "How is it possible that my account was hacked? Is there some insecure part in the Skype connection from the iPhone app, like sending credentials over a non-encrypted channel?" Unfortunately, I got no answer from the support engineer, except for some funny comments and advice (after the fact, I read the Skype security evaluation, but I didn't find anything there that explains this incident either). Below are selected parts of the chat transcript:

Donald M: Michael, we understand that you would like to have your Skype account secured while using the application. 
Donald M: We’d be more than happy to assist you and provide you the best practice to keep you secured.
Donald M: To help you stay secure, we would like to share with you some useful tips and information about online security:
You: Ok, what can I do to keep the account secure?
Donald M: Please visit this link: http://www.skype.com/en/security/

I've read everything on that page, but I didn't find anything useful except for the advice to choose a strong enough password (and mine was strong enough).

Donald M: We strongly advise that every customer installs sufficient security software, such as an antivirus and a firewall on all their devices that use Skype and to keep them enabled and up to date.
You: Antivirus on iPhone?
Donald M: Skype does its best to keep your communication and personal information secure. 
Donald M: Yes!
Donald M: However, please be aware that Skype users should also take precautions against security threats by not sharing their private data and should install adequate security software on all their devices that use Skype.
You: There's no antivirus software on for iPhone mobile phone

Earlier, I had explained to the support engineer that I use Skype solely from my mobile phone.

Donald M: Yes and be careful when using or accessing Wifi connection. 

This last sentence simply killed me. What can I do when using public WiFi? Maybe wrap my iPhone in a condom?


Thursday, January 16, 2014

Python code indentation nightmare

After much hesitation, and overcoming my intuitive distaste for Python as a programming language, I finally decided to learn it. This is not just to get familiar with the core coolness of Python and join the Python mainstream, but more to avoid finding myself a Java mastodon (à la COBOL mastodon or FORTRAN mastodon) in the next 10 years. So I created a little project and started pushing lines of code into it.

My first experience with Python has not changed anything about my bad attitude towards Python syntax:
  • Code structures simply don't look right without opening and closing curly braces; everything looks like a big unstructured mess.
  • Docstrings, written as a comment in the body(!) where I can write any crap I want, simply don't feel like API documentation for a class/method/function.
  • """, which can be used either as a multi-line comment delimiter or ... as a multi-line string delimiter when assigned to a variable. Isn't it weird? One of my university professors used to say: "If you want to confuse a man, either call two different things by the same name, or call two equal things by different names", and I totally agree with him.
  • Two forms: "import thing" and "from thing import another_thing". Why do I need both of them? Why can't I just use: "import thing.another_thing" or "import thing [another_thing]" like in Perl?
  • Indentation...

Indentation deserves a post of its own. I spent about an hour trying to understand why a unit test that I had added to a forked project's test suite didn't run:

class TestSchemes(TestCase):
    ...
    def test_find(self):
        work = Scheme.find('wlan0', 'work')
        assert work.options['wpa-ssid'] == 'workwifi'

 # Added my test:
 def test_mytest(self):
  work = Scheme.find('wlan0', 'work')
  work.delete()
  work = Scheme.find('wlan0', 'work')
  assert work == None

I tried to run it: my test doesn't run, and there isn't a single error. I thought maybe test methods are registered somewhere (unlikely, but who knows...), but this wasn't the case. Maybe test_mytest is some reserved name in Python? I tried other method names, but with no luck. Finally, I tried one more thing: I copied and pasted one of the existing method declarations using Vim's 'yy' and 'p' commands, then renamed the pasted method, and voilà! That worked! Hm... What's the difference between:

    def test_mytest(self):
        work = Scheme.find('wlan0', 'work')
        work.delete()
        work = Scheme.find('wlan0', 'work')
        assert work == None

and:

 def test_mytest(self):
  work = Scheme.find('wlan0', 'work')
  work.delete()
  work = Scheme.find('wlan0', 'work')
  assert work == None

Right... this is spaces versus tabs. I always indent my code using tabs, so the method I added was also indented using tabs. As it turns out, the original code was indented using spaces, so the new method simply wasn't recognized until I replaced all tabs with spaces. It is a really, really weird situation when indentation characters affect execution flow. Don't you agree?
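
The most likely explanation for the silence: Python 2 treats a tab as advancing to the next multiple of 8 columns, so a method indented with one tab lines up with the 8-space body of the previous method and quietly becomes a function nested inside it. Python 3 refuses such a mix outright. The sketch below, with a made-up class body standing in for the real test suite, shows the Python 3 behaviour:

```python
# A class whose first method uses spaces and whose second method uses tabs,
# mimicking the situation described above (contents are illustrative only).
SPACES_THEN_TABS = (
    "class TestSchemes:\n"
    "    def test_find(self):\n"
    "        return 'found'\n"
    "\n"
    "\tdef test_mytest(self):\n"
    "\t\treturn 'mine'\n"
)


def indentation_status(source):
    """Compile the source and report whether its indentation is accepted."""
    try:
        compile(source, "<example>", "exec")
    except TabError as error:
        return "TabError: " + str(error.msg)
    return "ok"


if __name__ == "__main__":
    # Mixed tabs and spaces are rejected by Python 3.
    print(indentation_status(SPACES_THEN_TABS))
    # Replacing each tab level with 4 spaces makes the class compile.
    print(indentation_status(SPACES_THEN_TABS.expandtabs(4)))
```

Running Python 2 with the -tt option makes it treat inconsistent tab usage as an error as well.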

To wrap up, I'd like to say a few warm words about Python. There are plenty of frameworks and libraries written in Python. GitHub is teeming with interesting and fun projects you can learn from. If you want to build a quick prototype, it's awesome. Not as awesome as Java + Maven, but still :-) I'm a mastodon...

May the source be with you.

Thursday, December 09, 2010

Having fun with NodeJS

This is a memo for myself rather than a tutorial on building a simple Web site using server-side JavaScript (read why you may need this).

I have registered a domain name, and purchased a 'Level 1' VPS hosting from HostGator (with pre-installed CentOS 5.5), but you can try this on a virtual machine running on your desktop as well.

1. Let's start by installing node.js:

# Download the source package:
wget http://nodejs.org/dist/node-v0.2.5.tar.gz

# Build & Install:
tar -zxf node-v0.2.5.tar.gz
cd node-v0.2.5
./configure
make
sudo make install

2. Now, we need to install a package manager for node.js:

# It's recommended to run npm as a regular user, so we just make the relevant /usr/local/ directories writable by the group 'wheel', and add the relevant user to this group:

sudo chgrp -R wheel /usr/local/{share/man,bin,lib/node}
sudo chmod -R g+w /usr/local/{share/man,bin,lib/node}
sudo usermod -a -G wheel $USER

# Install npm:
curl http://npmjs.org/install.sh | sh

3. As the developer of node.js himself states, node.js isn't production-ready yet. What he suggests is running node.js behind a reverse proxy, such as nginx.

a) Installing nginx is very easy on CentOS:

# Add EPEL repository:
rpm -i http://download.fedora.redhat.com/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm

# Install 'nginx':
yum install nginx

b) Let's configure nginx proxy:

# First, comment out the server {...} section in /etc/nginx/nginx.conf:

vim /etc/nginx/nginx.conf

# Second, add the proxy configuration to /etc/nginx/conf.d/virtual.conf (we assume that your node.js application will be running on port 8000). The quoted 'EOF' keeps the shell from expanding nginx variables such as $remote_addr:

cat <<'EOF' >> /etc/nginx/conf.d/virtual.conf
upstream app_cluster_1 {
        server 127.0.0.1:8000;
}

server {
        listen 0.0.0.0:80;
        server_name node.local node;
        access_log /var/log/nginx/node.log;

        location / {
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
          proxy_set_header Host $http_host;
          proxy_set_header X-NginX-Proxy true;

          proxy_pass http://app_cluster_1/;
          proxy_redirect off;
        }
}
EOF

4. Create your simple Web application:

a) We will use ExpressJS as a Web framework, so we need to install it first:

npm install express

b) Let's install a template engine as well. I tried to use the Haml template engine, but I couldn't make it render pages properly, so I was forced to switch to ejs (and I don't regret it):

npm install ejs

c) Create the application folders:

mkdir -p /var/www/apps/your_app

# Static content will be placed here:
mkdir /var/www/apps/your_app/static

# Views (templates) folder:
mkdir /var/www/apps/your_app/views

Here are the files that we are going to create under these folders:

/var/www/apps/your_app/app.js

var sys = require('sys');
var express = require('express');
var app = express.createServer();

app.configure(function(){
        app.use(express.methodOverride());
        app.use(express.bodyDecoder());
        app.use(app.router);
        app.use(express.staticProvider(__dirname + '/static'));
        app.set('views', __dirname + '/views');
        //app.use(express.errorHandler({ dumpExceptions: true, showStack: true }));
});

var site_locals = {
        copyright: 'Copyright @ Michael Spector 2010',
};

app.get('/', function(req, res){
        res.render('hello.ejs', {
                locals: { site: site_locals },
        });
});

app.listen(8000);

/var/www/apps/your_app/views/layout.ejs

<html>
        <head>
                <title>My App</title>
                <link rel="stylesheet" href="/style.css" />
        </head>
        <body>
              <!-- Pay attention to this special construct, which will be replaced with the actual view contents (hello.ejs in our case): -->
              <%- body %>
              <hr/>
              <%= site.copyright %>
        </body>
</html>

/var/www/apps/your_app/views/hello.ejs

<h1>Hello, World!</h1>

/var/www/apps/your_app/static/style.css

body {
   text-align: center;
}

5. The last thing is to make sure that if node.js unexpectedly dies, it will be started again automatically. For this purpose we will install monit:

# Install monit:
yum install monit

# Uncomment "set httpd" entries in the main configuration file:
vim /etc/monit.conf

# Create the configuration file for your application (note that we define the NODE_ENV=production variable prior to running node.js, which should enable all production features, like caching, etc.):

cat <<EOF > /etc/monit.d/your_app
check host your_app with address 127.0.0.1
    start program = "/bin/sh -c 'NODE_ENV=production /usr/local/bin/node /var/www/apps/your_app/app.js'"
        as uid nobody and gid nobody
    stop program  = "/usr/bin/pkill -f 'node /var/www/apps/your_app/app.js'"
    if failed port 8000 protocol HTTP
        request /
        with timeout 10 seconds
        then restart
EOF

6. Finally, configure all services to start automatically when system boots, and start them:

sudo chkconfig nginx on
sudo chkconfig monit on

/etc/init.d/nginx restart
/etc/init.d/monit restart

monit stop your_app
monit start your_app

If you need to restart your application upon re-deployment, run:

monit restart your_app

Go to http://<your-ip>/, and have fun!

Hope this tutorial helps you.

Thursday, November 25, 2010

Run FindBugs from your Eclipse RCP headless build

Running FindBugs from an Eclipse RCP headless build is pretty simple:

1. Add the following target to your customTargets.xml (replace "com.yourcompany" with your package/plug-in prefix):

2. Create input filter file (findbugs-filter.xml):

3. Invoke "findbugs" target from the "prePackage" target:

4. Make sure the environment variable FINDBUGS_HOME points to the FindBugs installation.

5. (For Hudson users) Install and configure the FindBugs plug-in to get the fancy "bugs" trend graph :-)

Saturday, October 09, 2010

Headless testing of RCP application

A week or so ago I needed to add unit test invocation to the headless build of my Eclipse RCP application. My original PDE build configuration consisted of a single target for the .product with "runPackager=true". I was too lazy to create an additional one for the test feature that contains the unit test plug-ins (which would also have cost me a longer build time, BTW).

So I decided to include the test feature in the .product, and to remove the dependency on it after all tests had been executed. This simple piece of Ant code does all the magic:



It needs to be executed twice: once in the prePackage target, and once more in the postBuild target in order to fix the resulting p2 repository as well.