The RowAnalytics Technology ‘Stack’ – Why it Matters


Given the abundance of coding languages, open-source and proprietary options, databases, bug trackers and software development management tools, it’s easy to get overwhelmed by choice. The team at RowAnalytics made those choices based on our unique set of challenges. In the field of combinatorial multi-omic analysis, we have demonstrated that we can deliver ground-breaking results using a single high-end desktop PC with parallel GPUs. Everything we have achieved to this point has been optimised to demonstrate proof-of-concept with a lean start-up mindset.

Last month, Steve participated in a Reddit™ AMA session and was asked by Redditors to shed some light on the various tools we use internally for our technology platforms and why we chose them. Our general approach is to find the best tool to solve the largest problem. Computational efficiency is the toughest challenge we face here at RowAnalytics, and here’s why technology choice matters.

Unlike many other Artificial Intelligence (AI) / Machine Learning (ML) start-ups, we don’t use standard AI frameworks. For some, this comes as a surprise. However, our experienced team, including our COO, who was one of the first AI leaders at NASA nearly 30 years ago, has learned that standard AI frameworks aren’t particularly efficient when it comes to solving high-dimensional, hyper-combinatorial, multi-modal problems.
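To give a feel for why "hyper-combinatorial" problems are so demanding, here is a minimal Python sketch that counts how many feature combinations exist as the combination order grows. The function name and the figure of 1,000 features are hypothetical, chosen purely for illustration; this is not our platform code, only the standard binomial arithmetic behind the problem.

```python
from math import comb


def combination_count(num_features: int, max_order: int) -> int:
    """Count the feature subsets of size 1..max_order drawn from num_features."""
    return sum(comb(num_features, k) for k in range(1, max_order + 1))


# Hypothetical feature count, used only to show how quickly the search space grows.
for order in (1, 2, 3, 4):
    print(order, combination_count(1000, order))
```

Each extra order multiplies the search space by roughly the number of features, which is why efficiency dominates every other design consideration.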

On the front end, we use Angular and AngularJS, with D3.js for advanced data visualization.

Our code tends to be held together with Python, which is widely used, so it is easy to find software developers experienced with it. We also use a lot of C++ and CUDA (mainly for linear algebra), both for the same reason and for their efficiency. Avoiding bloated code was an important consideration here. Even though we don’t depend on neural nets for our framework, we modelled our deep semantic and combinatorial data mining tools on Latent Semantic Indexing (LSI) principles to take full advantage of GPU architecture, which can execute about 800x more instructions per unit of time than a CPU. Designing our high-dimensional combinatorial multi-omics platform to take full advantage of the computational efficiency of a GPU was a mission-critical decision.
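For readers unfamiliar with LSI, the idea is to factor a term-document matrix with a singular value decomposition and keep only the strongest latent dimensions. The sketch below is a deliberately tiny, pure-Python illustration that recovers just the single strongest dimension by power iteration. The matrix values, function names and iteration count are all assumptions made for this example; it is not our production code, which relies on C++ and CUDA linear algebra.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def dominant_singular_triple(matrix, iterations=200):
    """Power iteration on A^T A; returns (sigma, u, v) for the largest singular value."""
    at = transpose(matrix)
    v = normalize([1.0] * len(matrix[0]))
    for _ in range(iterations):
        v = normalize(matvec(at, matvec(matrix, v)))
    av = matvec(matrix, v)
    sigma = math.sqrt(sum(x * x for x in av))
    u = [x / sigma for x in av]
    return sigma, u, v


# Hypothetical term-document counts: rows are terms, columns are documents.
term_document = [
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 1.0],
]

sigma, term_loadings, doc_loadings = dominant_singular_triple(term_document)
print(sigma, term_loadings, doc_loadings)
```

In this toy matrix the first two documents share vocabulary, so they load equally on the dominant latent dimension while the third barely registers. The same matrix products are the kind of dense linear algebra that maps well onto a GPU.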

On the back end, we’ve used Python and Node.js. We have used NoSQL databases like Redis, Aerospike and MongoDB, as well as graph databases such as Neo4j and OrientDB. Of course, we also have a couple of relational databases in the mix to support rich contextualization, including MySQL and PostgreSQL. Our platform compiler is written in portable C++ for Windows and Linux, with our runtime implemented in Java, in JavaScript for client-side web browsers (autogenerated from the Java source using JSweet), and in C#/.NET and C++.

Our decision support tools are typically deployed on low-power IoT devices such as mobiles, wearables and smart sensors, so computational efficiency is a priority. We have some Java, .NET and Python implementations, but the tools can also run much closer to the metal on things like FPGAs. On top of the basic C++ API, we have an API for Java, provided by a Java Native Interface layer, and an API for .NET, provided by a C++/CLI layer.

Our personalized digital health mobile apps use the Java API for RESTful Services (JAX-RS), with the Jersey service executed on a Linux server by Apache Tomcat. API documentation is automatically generated by Enunciate and Swagger. Our IDEs are mainly MS Visual Studio 2015, Anaconda and Eclipse, according to personal preference. We use Git for version control with GitHub as a common repository, and Jenkins as our Continuous Integration server, which compiles and tests builds as changes are committed to GitHub. For teamwork, the tools we rely on are JIRA/Confluence for project management, JIRA Service Desk for user support and defect management, and Zoom, Skype, Dropbox and AWS for communications, compute and file sharing.

When asked the question “which technology is best?”, the answer is, “…it depends.” The choice depends on the problem you are trying to solve. In our case, the problem and priority #1 is computational efficiency. The technology stack we selected was critical to maximizing computational efficiency for our high-dimensional and complex combinatorial multi-modal analytics.

If you’re curious, the original response on Reddit can be accessed by following this link.

And, if you have questions surrounding any of our technologies please email us at info@rowanalytics.com or comment below.

We’re always open to feedback, Q&A and good dialogue!

Posted May 30th, 2018
