Developer’s Journal: First Steps into the SAP HANA World


Introduction

A long time ago when I first started blogging on SDN, I used to write frequently in the style of a developer journal. I was working for a customer and therefore able to just share my experiences as I worked on projects and learned new techniques. My goal with this series of blog postings is to return to that style but with a new focus on a journey to explore the new and exciting world of SAP HANA.

At the beginning of the year, I moved to the SAP HANA Product Management team and I am responsible for the developer persona for SAP HANA. In particular I focus on tools and techniques developers will need for the upcoming wave of transactional style applications for SAP HANA.

I come from an ABAP developer background, having worked primarily on ERP. My first instinct is therefore to draw correlations back to what I understand from the ABAP development environment, and to begin analyzing how development with HANA changes so many of the assumptions and approaches that ABAP developers have.

Transition Closer to the Database

My first thought after a few days working with SAP HANA is that I needed to seriously brush up on my SQL skills. Of course I have plenty of experience with SQL, but as ABAP developers we tend to shy away from the deeper aspects of SQL in favor of processing the data on the application server in ABAP. For ABAP developers reading this, when was the last time you used a sub-query or even a join in ABAP? Or even a SELECT SUM? As ABAP developers, we are taught from early on to abstract the database as much as possible, and we tend to trust the processing on the application server, where we have total control, over the “black box” of the DBMS. This situation has only been compounded in recent years as we have gained a larger number of tools in ABAP which generate the SQL for us.

This approach has served ABAP developers well for many years. Let’s take the typical situation of loading supporting details from a foreign key table. In this case we want to load all flight details from SFLIGHT and also load the carrier details from SCARR. In ABAP we could of course write an inner join:

However many ABAP developers would take an alternative approach where they perform the join in memory on the application server via internal tables:

This approach can be especially beneficial when combined with the concept of ABAP table buffering. Keep in mind that I’m comparing developer design patterns here, not the actual technical merits of my specific examples. On my system the datasets weren’t actually large enough to show any statistically relevant performance difference between these two approaches.
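To make the two design patterns concrete, here is a minimal sketch in Python, with an in-memory SQLite database standing in for the database layer. It is not ABAP and it does not run against HANA; the table and column names follow the SFLIGHT/SCARR flight model mentioned above, while the sample rows and both function names are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for the database. Table and column names follow
# the SFLIGHT/SCARR flight model; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scarr   (carrid TEXT PRIMARY KEY, carrname TEXT);
    CREATE TABLE sflight (carrid TEXT, connid TEXT, fldate TEXT, price REAL);
    INSERT INTO scarr VALUES ('AA', 'American Airlines'), ('LH', 'Lufthansa');
    INSERT INTO sflight VALUES
        ('AA', '0017', '2012-01-10', 422.94),
        ('LH', '0400', '2012-01-11', 666.00),
        ('LH', '0402', '2012-01-12', 669.00);
""")


def join_in_database(conn):
    """Pattern 1: the database performs the inner join."""
    sql = """
        SELECT f.carrid, f.connid, f.fldate, c.carrname
          FROM sflight AS f
         INNER JOIN scarr AS c ON c.carrid = f.carrid
         ORDER BY f.carrid, f.connid
    """
    return conn.execute(sql).fetchall()


def join_in_application(conn):
    """Pattern 2: read both tables, then join them in application memory."""
    carriers = {
        carrid: carrname
        for carrid, carrname in conn.execute("SELECT carrid, carrname FROM scarr")
    }
    flights = conn.execute(
        "SELECT carrid, connid, fldate FROM sflight ORDER BY carrid, connid"
    ).fetchall()
    # One record at a time, similar in spirit to looping over an internal
    # table and doing a keyed read on a second one.
    return [
        (carrid, connid, fldate, carriers[carrid])
        for carrid, connid, fldate in flights
        if carrid in carriers
    ]


# Both patterns produce the same result set; only the place of work differs.
print(join_in_database(conn) == join_in_application(conn))
```

Note that the second pattern issues two statements and moves every carrier row to the application before any matching happens, which is exactly the trade-off under discussion.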

Now if we put SAP HANA into the mixture, how would the developer approach change? In HANA the developer should strive to push more of the processing into the database, but the question might be why?

Much of the focus on HANA is that it is an in-memory database. I think it’s pretty easy for most any developer to see the advantage of all your data being in fast memory as opposed to relatively slow disk-based storage. However, if this were the only advantage, we wouldn’t see a huge difference between processing in HANA and processing in ABAP. After all, ABAP has full table buffering. Ignoring the cost of updates, if we were to buffer both SFLIGHT and SCARR, our ABAP table loop join would be pretty fast, but it still wouldn’t be as fast as HANA.

The other key points of HANA’s architecture are that, in addition to being in-memory, it is also designed for columnar storage and for parallel processing. In the ABAP table loop, each record in the table has to be processed sequentially, one record at a time. The current versions of ABAP statements such as these just aren’t designed for parallel processing. Instead, ABAP leverages multiple cores/CPUs by running different user sessions in separate work processes. HANA, on the other hand, has the potential to parallelize blocks of data within a single request. The fact that the data is all in memory only further supports this parallelization by making access from multiple CPUs more useful, since data can be “fed” to the CPUs that much faster. After all, parallelization isn’t useful if the CPUs spend most of their cycles waiting on data to process.
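The idea of parallelizing blocks of data within a single request can be sketched roughly as follows. This is only an illustration of the split-and-merge shape in Python, not a description of how HANA schedules work internally; the function names, block size and worker count are all assumptions, and Python threads will not show a real speed-up for this kind of CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_sequential(values):
    """One record at a time, like a plain loop over an internal table."""
    total = 0
    for value in values:
        total += value
    return total


def split_into_blocks(values, block_size):
    """Cut one column of values into fixed-size blocks."""
    return [values[i:i + block_size] for i in range(0, len(values), block_size)]


def sum_parallel(values, block_size=1000, workers=4):
    """Aggregate each block independently, then merge the partial results."""
    blocks = split_into_blocks(values, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, blocks))
    return sum(partial_sums)


# Hypothetical column of integer values for a single request.
seats_occupied = list(range(10_000))
print(sum_parallel(seats_occupied) == sum_sequential(seats_occupied))
```

The important point is that a single request is broken into independent pieces whose partial results can be combined, which is something a record-by-record loop cannot offer.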

The other technical aspect at play is the columnar architecture of SAP HANA. When a table is stored columnar, all data for a single column is stored together in memory. Row storage (which is how even ABAP internal tables are processed) places data in memory a row at a time.

This means that for the join condition, the CARRID column in each table can be scanned faster because of the arrangement of the data. Scans over unneeded data in memory don’t have nearly the cost of performing the same operation on disk (because of the need to wait for platter rotation), but there is a cost all the same. Storing the data columnar reduces that cost for operations which scan one or more columns, and it also lets compression routines work more effectively.
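A small sketch may help picture the difference between the two layouts. It uses plain Python lists rather than anything HANA-specific; the sample rows are invented, and run-length encoding is used only as one simple, illustrative compression scheme that benefits from keeping a column’s values together.

```python
# Row layout: each record keeps all of its fields together.
rows = [
    ("AA", "0017", 422.94),
    ("LH", "0400", 666.00),
    ("LH", "0402", 669.00),
]

# Columnar layout: all values of one column are kept together.
columns = {
    "carrid": [row[0] for row in rows],
    "connid": [row[1] for row in rows],
    "price": [row[2] for row in rows],
}


def scan_rows(rows, carrid):
    """Row store: every full record is visited just to test one field."""
    return [i for i, row in enumerate(rows) if row[0] == carrid]


def scan_column(columns, carrid):
    """Column store: only the CARRID values are visited."""
    return [i for i, value in enumerate(columns["carrid"]) if value == carrid]


def run_length_encode(values):
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1] = (value, encoded[-1][1] + 1)
        else:
            encoded.append((value, 1))
    return encoded


print(scan_rows(rows, "LH") == scan_column(columns, "LH"))
print(run_length_encode(columns["carrid"]))
```

Both scans find the same record positions, but the columnar scan never touches the connection or price values, and repeated carrier IDs sitting next to each other compress naturally.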

For these reasons, developers (and especially ABAP developers) will need to begin to rethink their application designs. Although SAP has made statements about having SAP HANA running as the database system for the ERP, extracting the maximum benefit from HANA will also require pushing more of the processing from ABAP down into the database. This will mean ABAP developers writing more SQL and interacting more often with the underlying database. The database will no longer be a “bit bucket” to be minimized and abstracted, but instead another tool in the developer’s toolset to be fully leveraged. Even the developer tools for HANA and ABAP will move closer together (but that’s a topic for another day).

With that change in direction in mind, I started reading some books on SQL this week. I want to grow my SQL skills beyond what is required in the typical ABAP environment as well as refresh my memory on things that can be done in SQL but perhaps I’ve not touched in a number of years. Right now I’m working through the O’Reilly Learning SQL 2nd Edition by Alan Beaulieu. I’ve found that I can study the SQL specification of HANA all day, but recreating exercises forces me to really use and think through the SQL usage. The book I’m currently studying actually lists all of its SQL examples formatted for MySQL. One of the more interesting aspects of this exercise has been adjusting these examples to run within SAP HANA and more importantly changing some of them to be better optimized for Columnar and In-Memory. I think I’m actually learning more by tweaking examples and seeing what happens than any other aspect.

What’s Next

There are actually lots of aspects of HANA exploration that I can’t talk about yet. While learning the basics and mapping ABAP development aspects onto a future that includes HANA, I also get to work with functionality which is still in the early stages of development. That said, I will try to share as much as I can via this blog over time. In the next installment I would like to focus on my next task for exploration: SQLScript.

23 responses to “Developer’s Journal: First Steps into the SAP HANA World”

  1. Arun Bala says :

    Good post for ABAPers who want to venture into the DB space and explore the two worlds. Great post! Regards, @openingbrace

  2. Anjan says :

    Hi,
    Good overview… 🙂
    Can you please give a bit more detail on SQLScript?

    I think we should maintain a balance between the data loaded into memory and the operations performed in the DB.
    I think we should push operations to the DB in cases where less data then needs to be sent back to the application, so that communication between the database and the application contributes a less significant amount of time to the overall processing time.

    Thank you,
    Anjan…

  3. Thomas Jung says :

    @Anjan – SQLScript is very next on my blogging to-do list. I have a whole scratch pad page full of blogging ideas. The only downside is that even though blogs are pretty informal style of writing, it still takes a day or so to write something like this. That much time is tough to squeeze out of my schedule. I’m going to try for bi-weekly blogs on this topic, but that may at times get interrupted by my travel schedule.

    Your comment on reducing the amount of data sent to the application server is a central one in HANA. The weakest link is generally the network connection between the database and the application server. Therefore it makes sense to reduce the result set as much as possible before returning it to the application server. In ABAP in the past we have sometimes sacrificed performance in this transfer by moving large result sets and then making up for it with fast processing via internal tables. In the future you won’t be able to justify this approach, as performing those operations at the DB level will actually be faster.

  4. KK Ramamoorthy says :

    Tom
    Thanks for starting this series. Comes at the right moment with all the buzz on what’s next for HANA. I still think that analytics is the low hanging fruit for HANA but HANA for transactional systems is an interesting option for the near future. BW as we know will not exist with the use of HANA for transactional systems, I guess.
    This series reminds me of your famous BSP series and looking forward to follow this as well.

    Cheers
    KK

  5. Anjan says :

    Really appreciate on your efforts… it helps to get on speed.
    Yes i do agree on your comments.

    Thank You,
    Anjan…

  6. John Irvin says :

    Thomas
    I started my career as an ABAP developer 20 years ago and I’m now working with HANA. I echo your sentiments that this is a total paradigm shift for ABAP developers but one that will be required for the SAP ERP system and SAP customers to take full advantage of HANA.

    Look forward to reading your future blogs on this topic.

    Thank you
    John

  7. Diego Dora says :

    Hi Tom,

    This is a great post, and as an ABAP developer I felt related to “DB Performance dilemma”, and most definitely we will need to bring our “A” game in terms of SQL statements (today is really rusty).

    It makes me think that becoming “HANA enabled” is going to have a big impact on currently active ERP systems, similar to the days of those huge conversion projects to make systems Unicode enabled: having to create inventories of programs, re-writing code and changing the best practices. Do you think it could be possible to have some programs enabled for HANA and others not?

    I’m excited to read the next post, in the meanwhile I’ll order on Amazon the Alan Beaulieu’s book!

    Thanks and Saludos,
    Diego.

  8. Thomas Jung says :

    @ Diego –
    >Similar to the days of those huge conversion projects to make systems Unicode enable

    The difference here will be that you don’t have to make all these changes and updates to objects before you can go live with HANA. You would go with the current code and have roughly the same performance or only a little improvement. You can then phase these enhancements in over time and focus on the areas which most need a performance boost.

  9. Diego Dora says :

    @Thomas,

    This is great news! without a doubt it’s going to be easier for me to sell HANA to IT managers and executives since that HANA is not only the future but also the data migration and system readiness is not that invasive and you can actually control how expensive it would be in terms of re-writing apps.

    Thanks!
    Diego.

  10. Berry Funder says :

    Wonderful and interesting post!
    Looking forward to reading your future blogs on this topic.

    Thank you

  11. Jon Reed says :

    Thomas – been meaning to comment for a while but it’s AWESOME to see you back to your blogging roots again.

    Those who are newer on the scene may not appreciate how freakin’ awesome your early SCN ABAP and SAP development blogs were.

    When we had the pick of features for SAP tips back when I was running that in the early 90s I always used to request your stuff, in fact that was how I first learned of your work.

    I’m sure this new round of HANA blogs will be a huge asset..looking forward to more.

    Just a quick question: someone said to me recently that HANA wasn’t programmed in ABAP and it made me wonder – do you know what language HANA was programmed with? I know that obviously SQL is key to the DB functions etc, as well as SQLScript. Just curious as I don’t know the roots of how HANA was built from a dev standpoint though I know some of the in-memory building blocks that were used…

  12. Thomas Jung says :

    @Jon – Thanks for the kind words. It feels really good to be blogging in this style again. Here’s hoping I can keep up with the ambitious schedule which I have set for myself.

    Good question on what language HANA is written in.

    Short Answer: mostly C/C++

    Long Answer: When you get deep down, even ABAP isn’t written in ABAP. The virtual machine, memory management, database interaction layer – all the really deep stuff is implemented as platform specific binaries which we call the Kernel. This layer is all implemented using C/C++.

    Likewise, most of the core code in HANA is also implemented in C/C++. There are a series of binaries which contain the various SQL engines, calculation engine, memory management, repository, etc. There is also a little bit of Python mixed in the engine as well.

    That actually has little bearing on what languages you use to interact with or develop on top of HANA. SQL is the main tool used to access data within HANA. We have a procedural extension to SQL which we call SQLScript. SQLScript is largely implemented using L-Lang, and only inside SAP are we allowed to write directly in L-Lang. We also have Business Functions which are callable via SQLScript and which are also implemented in C/C++ (writing these is also SAP internal only). Furthermore, in SPS4 we will release the functionality to allow the use of R from within SQLScript.

    The XS engine has two runtime containers: one which allows you to develop in C/C++ (SAP internal only) and one based upon the Mozilla SpiderMonkey VM for server-side JavaScript (this will eventually be opened for customer and partner development). In addition, we already use SAPUI5 within the XS engine.

    And this doesn’t even cover all the ways to run other applications upon HANA (ODBC/JDBC/ABAP Open SQL/Secondary Database Connection, etc). I’m sure I probably missed a component or two as well, but then again you were probably only looking for the short answer. 🙂

  13. Diego Dora says :

    @Thomas,

    The answer to Jon’s question actually triggered 2 new ones, one could be classified as naive and the second is a regular one:

    1. (naive) During the past 6 years I got to debug/interpret thousands and thousands of standard code – never had a problem understanding it except for every time that you get to a “CALL cfunc”. In a way is like a black hole, you would never know what’s from the other side. So the question is:

    There is any way to get access to the Kernel code in C/C++? – I imagine that the answer is no, but I have to ask :).

    2. Where I can read more about these components and the overall architecture of HANA? I’m looking to go deep into the long answer 🙂

    Thanks again! this is really great info.

  14. Thomas Jung says :

    @Diego
    >There is any way to get access to the Kernel code in C/C++?
    Nope. That’s kept under lock and key in a sub-basement vault in a secret building in Walldorf guarded by unicorns and trolls. 🙂 In all seriousness, there is too much sensitive stuff in there to give out access. I also can’t say I’ve ever felt the need to follow a debug session down into a kernel method or cfunction. I trust it works; otherwise I open a support ticket.

    >Where I can read more about these components and the overall architecture of HANA
    Jeff Word is soon going to publish a book which will cover some of this. We have tons of architectural documents inside SAP as well, although much of these aren’t ready for public consumption. Tomorrow I will have to look around and see which of them are online, publicly. We have a really detailed architectural document that we use internally, but I’m betting it isn’t available outside of SAP because of how much sensitive stuff is in there. But let me see what I can find.

  15. Thomas Jung says :

    Thanks to a comment on Facebook from my co-worker, Ron Silberstein, let me clarify one thing. When I said SAP Internal in the above description, I didn’t specify the difference between “SAP Internal, but our application teams can use it” and “SAP Internal, only the core HANA team can use it”. We actually do make this distinction within SAP. The C++ container of XS is the latter: only the core HANA team can use it. The Business Functions and L-Lang SQLScript are the former: SAP Internal, but our application teams can use them.

  16. Jeff Word says :

    Hi Thomas

    We’re actually working on a public version of the internal HANA bluebook you mentioned which should address most of the questions above and we’ll have a whole chapter in the HANA Essentials book that deals with app development on a high level.

    What you said is all correct, but really isn’t something that most customers or partners would ever see or even be able to mess around with. Not that HANA is a total black box, but the core engines are not something anyone should ever be messing around with. From a computer science perspective, it might be interesting, but from a practical perspective, its pretty irrelevant to what a developer or admin would ever need to know in the real world.

  17. Thomas Jung says :

    @Jeff Word
    >a practical perspective, its pretty irrelevant to what a developer or admin would ever need to know in the real world

    I assume you’re referring to my comment answering Jon and not the blog post itself. Absolutely true. I was addressing those who like to know what’s going on under the hood. Or perhaps in this case a more appropriate analogy would be “what’s going on inside the engine block”. However, there’s also the point that we have a variety of touch points to HANA in order to support many different channels of access. Although they aren’t all released for customer usage just yet, it speaks to our plans for the openness of HANA.

  18. Diego Dora says :

    @Thomas,

    >That’s kept under lock and key in a sub-basement vault in a secret building in Waldorf guarded by unicorns and trolls.

    I knew about high security protocols but unicorns and trolls is something that I wasn’t expecting… I will talk with Gandalf and get back to you 😉

    >I also can’t say I’ve ever felt the need to follow a debug session down into a kernel method or cfunction. I trust it works; otherwise I open a support ticket.

    I know that is an odd request, but half of my background is GNU/Linux where if there is something wrong with a kernel you just get the source code, fix it, re-compile and you’re ready to GO!
    It’s kind of in my nature want to know more to understand better the context where I’m developing and collaborate to make things even better. But I do understand the reasons why some components aren’t released and most definitely if I see an issue I will open a ticket (after all that’s a way to contribute)

    >Jeff Word is soon going to publish a book which will cover some of this. We have tons of architectural documents inside SAP as well, although much of these aren’t ready for public consumption.

    That’s great! I’m looking forward to read something about it. I also would like to say that my interest in this comes from my passion for the Computer Science. Nevertheless, I have to say that understanding how was built the platform/context/framework/etc where I’m developing let me understand why reacts in the way it does. A Simple example:

    A Binary Search could be code in different ways, knowing how it’s code let you predict and understand at a really detailed level if it’s convenient or not in some particular cases.

    Part of my daily job is to take care of system’s performance and the analysis that we perform are at a really low level. With this I’m not saying that this information should be released by any means – I understand all of the above comments… just that it could be useful in some cases for some extremist like myself :).

    Again, thanks for all the good knowledge that you share! Is great to have people that interacts in such a kind way. Looking forward to read more!

  19. Edgardo says :

    Hi Thomas,
    Is there any way to develop our own application function libraries in C++?

    • Thomas Jung says :

      No. A very small number of partners working directly with SAP are allowed to develop application functions in the C layer. Most requirements should be addressable via SQLScript or JavaScript.

  20. Alice says :

    Dear Thomas

    Graham Robinson asked you a couple of questions for my bachelor thesis at last week’s SAPPHIRE (http://scn.sap.com/community/abap/hana/blog/2013/05/19/answering-a-few-questions-about-hana-for-alice). Thank you very much for the answers! That already was a great help! He mentioned that you might be willing to answer some more of my questions in case I stumble on anything.
    I have also just seen that you have this blog for developers in the HANA world. As one of my topics in my dissertation are custom built applications in HANA, I was wondering if you maybe could answer some of my questions. I’m very sorry if those questions are a bit naïve, but this thesis is actually the first time I really got in touch with SAP / ABAP / HANA etc…

    You mention here that the main difference to developing regular ABAP applications is that for HANA you will have to use more SQL. What would be the main challenges developers have to face when developing specifically for Business Suite on HANA?

    Another question I have is: Is there any easy way to determine how big the effort of re-writing the coding would be? For example, is there a way to estimate what percentage of coding needs to be optimized for HANA in average? I’ve also learned that there’s something called “Performance Guidelines” for ABAP Developing. Do they also have those for ABAP Development for HANA and if so, what do they look like?

    I’ve also seen that you have just finished creating a course for developing for HANA – just to make sure: this is a general course for HANA developers right? Or is there a special one for BW and a special one for Business Suite on HANA? How long would training a regular ABAP programmer to develop applications optimized for HANA take?

    And lastly, how do you evaluate the need to optimize those custom built applications for HANA and how much time and cost should companies plan for the change management of those optimizations?

    Thank you very much in advance and I’m looking forward to hearing from you!
    Alice

    • Graham Robbo says :

      Hi Alice,

      your message only just popped up on my RSS feed this morning yet it is dated May 26. I’m not sure why that is the case, but if you haven’t heard from Thomas that might be the trouble. If he has not seen it either, it could be because something is borked in the feed from this site.

      To get you started you can go to my site at http://www.yelcho.com.au and you will find on the home page some recent video recordings. The most recent is part 2 of a discussion about HANA developer skills. In this session Thomas and I are joined by Tobias Trapp. Lower down the page you will find part 1, where we are joined by Thorsten Franz.

      You can also go to https://open.sap.com/ and register for the HANA developers course Thomas has built. The final exams are underway so you will probably not get to do them but all the content is freely available for download – which means video, slides and transcripts.

      Cheers
      Graham Robbo
