Monday, February 09, 2015

JavaScript Stored Procedures and Node.js Applications with Oracle Database 12c

                                      Kuassi Mensah
                                    db360.blogspot.com | @kmensah | https://www.linkedin.com/in/kmensah

Introduction                                                            

Node.js and server-side JavaScript are hot and trendy; per the latest “RedMonk Programming Languages Rankings[1], JavaScript and Java are the top two programming languages. For most developers building modern Web, mobile, and cloud based applications, the ability to use the same language across all tiers (client, middle, and database) feels like Nirvana but the IT landscape is not a green field; enterprises have invested a lot in Java (or other platforms for that matter) therefore, the integration of JavaScript with it becomes imperative. WebSockets and RESTful services enable loose integration however, the advent of JavaScript engines on the JVM (Rhino, Nashorn, DynJS), and Node.js APIs on the JVM (Avatar.js, Nodyn, Trireme), make possible and very tempting to co-locate Java and Node applications on the same JVM.
This paper describes the steps for running JavaScript stored procedures[2] directly on the embedded JVM in Oracle database 12c and the steps for running Node.js applications on the JVM against Orace database 12c, using Avatar.js, JDBC and UCP.
          

JavaScript and the Evolution of Web Applications Architecture                                   

At the beginning, once upon a time, long time ago, JavaScript was a browser-only thing while business logic, back-end services and even presentations where handled/produced in middle-tiers using Java or other platforms and frameworks. Then JavaScript engines (Google’s V8, Rhino) leave the browsers and gave birth to server-side JavaScript frameworks and Node.js.

Node Programming Model

Node.js and similar frameworks bring ease of development rapid prototyping, event-driven, and non-blocking programming model[3] to JavaScript. This model is praised for its scalability and good enough performance however, unlike Java, Node lacks standardization in many areas such as database access i.e., JDBC equivalent, and may lead, without discipline, to the so called “callback hell[4]”.
Nonetheless, Node is popular and has a vibrant community and a large set of frameworks[5].

Node Impact on Web Applications Architecture

With the advent of Node, REST and Web Sockets, the architecture of Web applications has evolved into 
(i) plain JavaScript on browsers (mobiles, tablets, and desktops); 
(ii) server-side JavaScript modules (i.e., Node.js, ORM frameworks) interacting with Java business logic and databases.
The new proposal for Web applications architecture is the integration of Node.js and Java on the JVM.  Let’s discuss the enabling technologies: JavaScript engine on the JVM and Node API on the JVM and describe typical use cases with Oracle database 12c.  

JavaScript on the JVM

Why implement a JavaScript engine and run JavaScript on the JVM? For starters, i highly recommend Mark Swartz ‘s http://moduscreate.com/javascript-and-the-jvm/ and Steve Yegge’s  http://steve-yegge.blogspot.com/2008/06/rhinos-and-tigers.html blog posts. 

In summary, the JVM brings (i) portability; (ii) manageability; (iii) Java tools; (iv) Java libraries/technologies such as JDBC, Hadoop; and (v) the preservation of investments in Java. 

There are several implementations/projects of Java based JavaScript engines including Rhino, DynJS and Nashorn.

Rhino
First JavaScript engine entirely written in Java; started at NetScape in 1997 then, became an open-source Mozilla project[6]. Was for quite some time the default JavaScript engine in Java SE, now  replaced by Nashorn in Java SE 8. 

DynJS
DynJS is another open-source JavaScript engine for the JVM. Here is the project homepage http://dynjs.org/. 

Nashorn
Introduced in Java 7 but “production” in Java 8[7], the goal of project Nashorn (JEP 174), is to enhance the performance and security of the Rhino JavaScript engine on the JVM. It integrates with javax.script API (JSR 223) and allows seamless interaction between Java and JavaScript (i.e., invoking Nashorn from Java and invoking Java from Nashorn).

To illustrate the reach of Nashorn on the JVM and the interaction between Java and JavaScript, let’s run some JavaScript directly on the database-embedded JVM in Oracle database 12c. 

JavaScript Stored Procedures with Oracle database 12c Using Nashorn


Why would anyone run JavaScript in the database? For the same reasons you’d run Java in Oracle database. Then you might ask: why run Java in the database, in the first place? As discussed in my book[8], the primary motivations are: 
(i) reuse skills and code, i.e., which programming languages are your new hire knowledgeable of or willing to learn; 
(ii) avoid data shipping[9] i.e., in-place processing of billions of data/documents; 
(iii) combine SQL with foreign libraries to achieve new database capability thereby extending SQL and the reach of the RDBMS, e.g., Web Services callout, in-database container for Hadoop[10]
Some developers/architects prefer a tight separation between the RDBMS and applications therefore, no programming language in the database[11]but there are many pragmatic developers/architects who run code near data, whenever it is more efficient than shipping data to external infrastructure.

Co-locating functions with data on the same compute engine is shared by many programming models such as Hadoop. With the surge and prevalence of Cloud computing, RESTful service based architecture is the new norm. Data-bound services can be secured and protected by the REST infrastructure, running outside the RDBMS. Typical use case: a JavaScript stored procedures service would process millions/billions of JSON documents in the Oracle database and would return the result sets to the service invoker.

To conclude, running Java, JRuby, Python, JavaScript, Scala, or other programming language on the JVM in the database is a sound architectural choice. The best practices consist in: (i) partitioning applications into data-bound and compute-bound modules or services; (ii) data-bound services are good candidates for running in the database; (iii) understand Oracle database 
DEFINER INVOKER rights [12] and grant only the necessary privilege(s) and/or permission(s). 

The Steps

The following steps allow implementing JavaScipt stored procedure  running in Oracle database; these steps represent an enhancement from the ones presented at JavaOne and OOW 2014 -- which consisted in reading the JavaScript from the file system; such approach required granting extra privileges to the database schema for reading from RDBMS file system something not recommended from security perspective. Here is a safer approach:

1.      Nashorn is part of Java 8 but early editions can be built for Java 7; the embedded JavaVM in Oracle database 12c supports Java 6 (the default) or Java 7. For this proof of concept, install Oracle database 12c with Java SE 7 [13]
2.      Build a standard Nashorn.jar[14]; (ii) modify the Shell code to interpret the given script name as an OJVM resource; this consists mainly in invoking getResourceAsStream() on the current thread's context class loader ; (iii) rebuild Nashorn.jar with the modified Shell
3.  Load the modified Nashorn jar into an Oracle database shema e.g., HR
 loadjava -v -r -u hr/ nashorn.jar
4.      Create a new dbms_javascript  package for invoking Nashorn’s Shell with a script name as parameter
create or replace package dbms_javascript as
  procedure run(script varchar2);
end;
/
create or replace package body dbms_javascript as
  procedure run(script varchar2) as
  language java name 'com.oracle.nashorn.tools.Shell.main(java.lang.String[])';
end;
/

Then call dbms_javascript,run(‘myscript.js’) from SQL which will invoke Nashorn  Shell to execute the previously loaded myscript.js .
5.  Create a custom role, we will name it NASHORN, as follows, connected as SYSTEM
SQL> create role nashorn;
SQL> call dbms_java.grant_permission('NASHORN', 'SYS:java.lang.RuntimePermission', 'createClassLoader', '' );
SQL> call dbms_java.grant_permission('NASHORN', 'SYS:java.lang.RuntimePermission', 'getClassLoader', '' );
SQL> call dbms_java.grant_permission('NASHORN', 'SYS:java.util.logging.LoggingPermission', 'control', '' );
Best practice: insert those statements in a nash-role.sql file and run the script as SYSTEM
6.      Grant the NASHORN role created above to the HR schema as follows (connected as SYSTEM):

SQL> grant NASHORN to HR;

7.      Insert the following JavaScript code in a file e.g., database.js stored on your client machine’s (i.e., a machine from which you will invoke loadjava as explained in the next step).
This script illustrates using JavaScript and Java as it
uses the server-side JDBC driver to execute a PreparedStatement to retrieve the first and last names from the EMPLOYEES table.

var Driver = Packages.oracle.jdbc.OracleDriver;
var oracleDriver = new Driver();
var url = "jdbc:default:connection:";   // server-side JDBC driver
var query ="SELECT first_name, last_name from employees";
// Establish a JDBC connection
var connection = oracleDriver.defaultConnection();
// Prepare statement
var preparedStatement = connection.prepareStatement(query);
// execute Query
var resultSet = preparedStatement.executeQuery();
// display results
     while(resultSet.next()) {
     print(resultSet.getString(1) + "== " + resultSet.getString(2) + " " );
     }
// cleanup
resultSet.close();
preparedStatement.close();
connection.close();


8.      Load database.js in the database as a Java resource (not a vanilla class)
loadjava –v –r –u hr/ database.js

9.      To run the loaded script

sqlplus hr/
SQL>set serveroutput on
SQL>call dbms_java.set_output(80000)
SQL>call dbms_javascript.run(‘database.js’);

The Nashorn Shell reads ‘database.js’ script stored as Java Resource from internal table; the JavaScript in its turn invokes JDBC to execute a PreparedStatement and the result set is displayed on the console. The message “ORA=29515: exit called from Java code with status 0” is due to the invocation of java.lang.Runtime.exitInternal; and status 0 means normal exit (i.e., no error). The fix is to remove that call from Nashorn. 

Node.js on the JVM

As discussed earlier, Node.js is becoming the man-in-the-middle between Web applications front ends and back-end legacy components and since companies have invested a lot in Java, it is highly desirable to co-locate Node.js and Java components on the same JVM for better integration thereby eliminating the communication overhead. There are several projects re-implementing Node.js APIs on the JVM including: Avatar.js, Nodyn, and Trireme. This paper will only discuss Oracle’s Avatar.js.

Project Avatar.js[15]

The goal of project Avatar.js is to furnish “Node.js on the JVM”; in other words, an implementation of Node.js APIs, which runs on top of Nashorn and enables the co-location of Node.js programs and Java components. It has been outsourced by Oracle under GPL license[16]. Many Node frameworks and/or applications have been certified to run unchanged or slightly patched, on Avatar.js.

There are binary distributions for Oracle Enterprise Linux, Windows and MacOS (64-bits). These builds can be downloaded from https://maven.java.net/index.html#welcome. Search for avatar-js.jar and platform specific libavatar-js libraries (.dll, .so, dylib). Get the latest and rename the jar and the specific native libary accordingly. For example: on  Linux, rename the libary to avatar-js.so; on Windows, rename the dll to avatar-js.dll and add its location to your PATH (or use -Djava.library.path=).

RDBMSes in general and Oracle database in particular remain the most popular persistence engines and there are RDBMS specific Node drivers[17] as well as ORMs frameworks. However, as we will demonstrate in the following section, with Avatar.js, we can simply reuse existing Java APIs including JDBC and UCP for database access.

Node Programming with Oracle Database using Avatar.js, JDBC and UCP 

The goal of this proof of concept is to illustrate the co-location of a Node.js application, the Avatar.js library, the Oracle JDBC driver and the Oracle Universal Connection Pool (UCP) on the same Java 8 VM.
The sample application consists in a Node.js application which performs the following actions:
(i) Request a JDBC-Thin connection from the Java pool (UCP)
(ii)Create a PreparedStatement object for “SELECT FIRST_NAME, LAST_NAME FROM EMPLOYEES
(iii)Execute the statement and return the ResultSet in a callback
(iv)Retrieve the rows and display in browser on port 4000
(v) Perform all steps above in a non-blocking fashion – this is Node.js’s raison d’ĂȘtre. The demo also uses Apache ab load generator to simulate concurrent users running the same application in the same/single JVM instance.For the Node application to scale in the absence of asynchronous JDBC APIs, we need to turn synchronous calls into non-blocking ones and retrieve the result set via callback.

Turning Synchronous JDBC Calls into Non-Blocking Calls

We will use the following wrapper functions to turn any JDBC call into a non-blocking call i.e., put the JDBC call into a thread pool and free up the Node event loop thread.
var makeExecutecallback = function(userCallback) {
 return function(name, args){
      ...
      userCallback(undefined, args[1]);
  }
}
 function submit(task, callback, msg) {
    var handle = evtloop.acquire();
    try {    var ret = task();
               evtloop.post(new EventType(msg, callback, null, ret)); {catch{}
    evtloop.submit(r);
}

Let’s apply these wrapper functions to executeQuery JDBC call, to illustrate the concept
exports.connect = function(userCallback) {..} // JDBC and UCP settings
Statement.prototype.executeQuery = function(query, userCallback) {
         var statement = this._statement;
          var task = function() {
          return statement.executeQuery(query);
       }
     submit(task, makeExecutecallback(userCallback), "jdbc.executeQuery");
}
Similarly the same technique will be applied to other JDBC statement APIs.
Connection.prototype.getConnection = function() {…}
Connection.prototype.createStatement = function() {..}
Connection.prototype.prepareCall = function(storedprocedure) {..}
Statement.prototype.executeUpdate = function(query, userCallback) {..}

Returning Query ResultSet through a Callback

The application code fragment hereafter shows how: for every HTTP request: (i) a connection is requested, (ii) the PreparedStatement is executed, and (iii) the result set printed on port 4000.
   ...
   var ConnProvider = require('./connprovider').ConnProvider;
var connProvider = new ConnProvider(function(err, connection){.. });

var server = http.createServer(function(request, response) {
  connProvider.getConn(function(name,data){..});     
  connProvider.prepStat(function(resultset) {
                while (resultset.next()) {
                   response.write(resultset.getString(1) + " --" + resultset.getString(2));
                   response.write('
');
    }
    response.write('
');
    response.end();   
}
server.listen(4000, '127.0.0.1');

Using Apache AB, we were able to scale to hundreds of simultaneous invocations of the Node application. Each instance grabs a Java connection from The Universal Connection Pool (UCP), executes the SQL statements through JDBC then return the result set via a Callbak on port 4000.

Conclusions

As server-side JavaScript (typified by Node.js) gains in popularity it’ll have to integrate with existing components (COBOL is still alive!!). Developers, architects will have to look into co-locating JavaScript with Java, across middle and database tiers.






[1] http://redmonk.com/sogrady/2015/01/14/language-rankings-1-15/
[2] I’ll discuss the rationale for running programming languages in the database, later in this paper.
[3] Request for I/O and resource intensive components run in separate process then invoke a Callback in the main/single Node  thread, when done.
[4] http://callbackhell.com/
[5] Search the web for “Node.js frameworks
[6] https://developer.mozilla.org/en-US/docs/Mozilla/Projects/Rhino
[7] Performance being one of the most important aspect
[8] http://www.amazon.com/exec/obidos/ASIN/1555583296
[9] Rule of thumb: when processing more than ~20-25% of target data, do it in-place, where data resides (i.e., function shipping).
[10] In-database Container for Hadoop is not available, as of this writing.
[11] Other than database’s specific procedural language, e.g.,  Oracle’s PL/SQL
[12] I discuss this in chapter 2 of my book; see also Oracle database docs.
[13] See Multiple JDK Support in http://docs.oracle.com/database/121/JJDEV/E50793-03.pdf
[14] Oracle does not furnish a public download of Nashorn.jar for Java 7; search “Nashorn.jar for Java 7”.
[15]  https://avatar-js.java.net/
[16] https://avatar-js.java.net/license.html
[17] The upcoming Oracle Node.js driver was presented at OOW 2014.