DaveWentzel.com            All Things Data

October 2013

Will MS Licensing Drive Customers to NoSQL?


Yep.  But that's just my opinion.  I have been asked to assist my client with evaluating various alternatives to SQL Server for an enterprise-class, multi-TB transactional database.  Both the client and I have very realistic expectations...we don't think we will find a good alternative, but we can always begin rolling out new features under a second DBMS.  Clearly if we stayed relational then something like PostgreSQL or MySQL would be a great choice.  But perhaps now is the time to think about alternative data organization paradigms that are not relational.  Why not a network model or graph model?  The next bunch of blog posts I do will cover my experiences working hands-on with various NoSQL solutions.  

For this post I want to talk about Microsoft practices that I believe are pushing more customers towards SQL Server alternatives.  SQL Server 2012 now has an option for core licensing for Standard and Enterprise.  You must buy 4 cores at a minimum and additional packs are a minimum of 2 cores.  Core licensing gets a little hairy for VMs.   Each VM CPU is considered a core, and the 4 core minimum still applies.  However, if you have Ent with Software Assurance and license all cores on the host then you can run unlimited VMs with SQL Server configured however you like.  

Prior to SQL Server 2012, the processor license cost was kinda high, so the accepted strategy was to get as much performance and capacity as you could afford for each expensive processor socket license that you purchased. In other words, max out each socket.  But socket licensing is gone.  With Enterprise Edition, which is what I see in-use almost exclusively at my customers, you only have core-based licensing.  Standard and BI Edition have "Server + CAL"-based licensing as well.  But BI Edition does not have core-based licensing.  Got all that?  Yeah yeah, Microsoft eased some confusion by ditching Datacenter Edition, so I guess it's a wash.  And there's still Web Edition for hosting providers only.  Whew.  

But again, my guess is core-based licensing is what most enterprise customers are choosing.  

Frankly I don't see what the point of Standard Edition is.  In both 2008 R2 and 2012 the limits are 64GB of RAM and 16 physical cores.  The RAM limit is the killer.  Right now my laptop has 32GB of RAM.  RAM is so dirt cheap that anyone running Standard with less than 64GB is, frankly, nuts.  The cost of the licensing vs the price of the RAM is just that skewed.  Here I'm a fan of using Microsoft's rules to my advantage.  If you have, say, 128GB of RAM in your server, why not run two instances of SQL Server under the same core-based licensing?  Granted, you'll need some kind of "sharding" strategy to do this, but isn't it time you had that anyway?  

So Enterprise Edition is, I believe, what most enterprise customers will deploy.  And I gotta say, online index operations and data compression, by themselves, easily justify the cost difference.  And then if you really need AlwaysOn availability groups, Enterprise is your only choice.  

Let's get back to the mathematics of core-based licensing.  Again, in the past the best practice was to pay the price premium for the absolute best processor available for each socket in your database server.  That may kill your budget with SQL 2012 EE.  In core-based licensing each physical socket in your server must use a minimum of four core licenses.  Let's say you have some old dual-core processor hardware lying around...I really wouldn't run EE on that, since you'll need to pay MS for four core licenses per socket to cover two cores.  So if your intention is to recycle old hardware and maybe just upgrade it from SQL 2008 to 2012, you might want to rethink that.  Granted, this kind of hardware is probably 5 years old and should be retired anyway.  
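If you want to play with the math yourself, here's a quick Python sketch of the 4-core-per-socket minimum.  The per-core price is a made-up placeholder, not Microsoft's actual list price:

```python
# Toy SQL Server 2012 core-licensing calculator.
# The 4-core-per-socket minimum means you pay for at least four
# core licenses per physical socket, even on dual-core chips.
PRICE_PER_CORE = 7000  # placeholder, NOT an actual Microsoft price

def licensed_cores(sockets, cores_per_socket):
    """Cores you must license: max(actual, 4) per socket."""
    return sockets * max(cores_per_socket, 4)

def license_cost(sockets, cores_per_socket):
    return licensed_cores(sockets, cores_per_socket) * PRICE_PER_CORE

# Recycling an old dual-socket, dual-core box still costs
# 2 * 4 = 8 core licenses -- half of them for cores you don't have.
print(licensed_cores(2, 2))   # 8
print(licensed_cores(2, 8))   # 16
```

The takeaway: on anything with fewer than four cores per socket, you're paying for phantom cores.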

So, what are some LEGAL ways to cut down on licensing fees?

  • Use Standard Edition and forego the EE features.  
  • You can still use lots of RAM, but you'll need to run multiple SE instances.  
  • With SE, consider a RAMdrive for any physical RAM over the 64GB limit.  For instance, put tempdb on a RAMdrive carved out of that extra RAM.  Microsoft even graciously gave us a KB article about it.  
  • Pay nothing and use Express Edition.  Granted, you are limited to about 1.5GB of RAM and 10GB databases, but if you are CAREFUL and shard properly this can actually be viable.  I worked for a company that sold software run entirely on Express and we had "databases" that were well over 100GB because we used sharding and FILESTREAM effectively.  I wrote about this in my blog post Express vs Standard and How to save a *lot* of money with SQL Server Express.  

I have a client that is actively looking to replace SQL Server with less costly alternatives.  I doubt they'll find a replacement that has a better TCO after factoring in the massive SQL code rewriting that will need to happen. But my next few posts will be my experiences with various NoSQL solutions.  I've been working on this project for months and I've learned (and re-learned) a lot of great things that I'm excited to share.  


Aaron Bertrand has a really good blog post called "Tip on using caution with MERGE" where he lists all of the bugs and weirdnesses with the MERGE statement.  It's a great post and I'm going to bookmark it because every time I use the MERGE statement I try to remember all of the little "issues" with it.  This blog post will be a checklist summary of issues that I can use next time I use the MERGE statement.  Aaron concludes that it is a productivity booster, but you need to be careful.  Good advice.  

Here are the issues:

  • Always use HOLDLOCK [MERGE blah WITH (HOLDLOCK) AS targ].  Internally the UPSERT is susceptible to race conditions without it.  I thought I wrote about this in the past, but I guess I didn't.  I always add HOLDLOCK to my MERGE statements.  
  • There are tons of Connect bugs around MERGE.  Most of them, I think, are spurious...I've never seen the problems myself, but YMMV.  Best to be familiar with them.  
  • Be careful with TRIGGERs.  @@ROWCOUNT is not reliable.  Aaron has a good repro script.  It's probably a minor issue, but the lesson learned is to test your triggers.  
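Here's the HOLDLOCK point in miniature.  This is Python, not T-SQL, and the dict and lock are just stand-ins for the target table and the range lock HOLDLOCK takes, but it shows why an unprotected check-then-write UPSERT is a race:

```python
# The UPSERT race, sketched: two writers both check for a key, both
# miss, and both "insert" -- unless check-and-write is made atomic,
# which is what HOLDLOCK buys you inside MERGE.
import threading

table = {}                # stands in for the target table
lock = threading.Lock()   # stands in for HOLDLOCK's key-range lock

def upsert(key, value):
    # Atomic check-then-write. Without the lock, two threads can both
    # see the key missing and both take the INSERT branch.
    with lock:
        if key in table:
            table[key] = table[key] + value   # UPDATE branch
        else:
            table[key] = value                # INSERT branch

threads = [threading.Thread(target=upsert, args=("k", 1)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(table["k"])   # 50: each writer either inserted or updated exactly once
```

Drop the lock and run it enough times and you'll eventually lose an update, which is exactly the anomaly Aaron describes for MERGE without HOLDLOCK.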


Final VMS Lessons

Life Lesson Number 9:  Always confess to your VAXidents!
If you were told by your boss not to put your coffee on your terminal, don't do it.  But if you do, and it spills, and you fry your terminal, be a man and admit it.  (See Life Lesson Number 5 in a previous post).  
My college was always getting used computing equipment donations.  Our biggest donor was Hershey Foods.  In 1995 their H.B. Reese subsidiary (they make the Reese's Peanut Butter Cups) replaced every VT100 terminal with brand new VT510's.  Of course, at the time, the 510, which was a disk-less terminal, was actually more expensive than even an outdated IBM PS/2.  Reese made us a simple deal...these terminals were already about 15 years old...we would drive up, test each terminal, load up the ones we wanted on a UHaul, and help them toss the ones we didn't want into a dumpster.  
I pleaded with the VP of Computing to walk away from this deal.  By the time we drove there, fired up each terminal, tested it, loaded up the good ones, and drove them back in a UHaul with a rigid suspension...we'd be lucky to have one or two working terminals.  Furthermore, these were antiquated technology.  But the VP thought that a few spare VT's in each dormitory would be good for students to check email.  
VT05 (introduced 1970), VT100 (1978), VT220 (1983), VT510 (1993...the last year for the VT series of terminals)  
I drove the 15 minutes to Reese in the UHaul and was presented with a pallet of VT100's.  I counted 220.  Just kinda thrown in a pile.  I spent the remainder of the day, and the following day, testing each terminal.  There were 25 working terminals.  So Reese got some free labor because I now had to toss the dead terminals into the dumpster.  Each VT100 weighs about, I'd say, 65 pounds.  And they are big and bulky.  
So now I had to carefully load 25 VT100's into a UHaul and hope they would still work after making the trip back to school.  They were terribly dusty and I couldn't stop sneezing.  One of their techs suggested we carefully pack them on the pallet, wrap them in bubble wrap and shrink wrap, and then carefully place them in the UHaul with a forklift.  I watched as they handled this for me...the least they could do since I had already carried the other 90% of the terminals to the dumpster.  Once loaded I carefully drove the UHaul back to school, avoiding as many central PA potholes as possible.  
Now I had to unload them.  The back of a UHaul in August is HOT.  So I found a forklift at school and decided that I could probably figure out how to unload the pallet all by myself.  
Bad idea.  
As you can imagine the net result was a pallet of old computing dinosaurs turned upside down on a loading dock.  It was an accident (or VAXident) but I again had an unhappy VP.  We managed to salvage 4 working VT100's.  The remainder went in a dumpster.  
That I loaded.  
Life Lesson Number 10:  So, how do you shutdown VMS?
A few posts back I told the story about how I did not know how to shutdown a VMS system.  So the question remains, how do you shutdown VMS?  
  • milk your customers with outrageous licensing fees while you outsource product development for at least the last decade
  • tell everyone to migrate to HPUX and buy new hardware and licensing.  Then watch while every customer gets wise and just migrates to Solaris or Linux instead.  
  • complain that your product is losing you money due to structural shifts in the marketplace.  
  • discontinue the product without open-sourcing it.  This dooms a viable product and screws your best, most loyal customers.  
Oh, I'm sorry...did you want to know the "command" to shutdown VMS?  That's easy:

@SYS$SYSTEM:SHUTDOWN
Kinda funny command isn't it?  
You shutdown something with a dollar sign in it.  That's HP for you lately...they shutdown any division that makes them dollars.  And OpenVMS has a lot of big contracts that HP really should be milking.  

More Life Lessons from VMS

Life Lesson Number 3: Data Management, and SQL, is here to stay
One of the first tasks as the new computer operator was to help write queries for Oracle RDB that ran on the VAX.  This was my first exposure to SQL, database management, etc.  The interesting thing is that OpenVMS has a database manager directly built into the OS called RMS.  You could "query" your OS just like your data.  Almost like DMVs in SQL Server today...but we had this in the early 1980's!  
I learned "SQL" but didn't even know it was SQL.  I was the "go-to" guy whenever someone needed an ad-hoc query.  "I need a list of adjunct professors who did not return their building keys after the semester."  No problem, I can do that.  "I need to know which alumni have not given to the pledge drive this year."  No problem, give me a day.  I was writing so many ad-hoc queries that I wrote a little macro in WordPerfect for VMS (yes, there was such a thing) that would help a user schedule queries for off-hours.  
I really did not understand how valuable I had become.  I was NOT a CompSci major.  For some reason I had this notion that I would make a good high school history teacher.  
During my junior year I applied for a job working for Electronic Data Systems.  At the time the only "toll-free" numbers started with 1-800.  Suddenly the RespOrg system, which is responsible for toll-free numbers, was running out of available numbers.  I was recommended for the job based on my experience with VMS and Oracle RDB.  The job was to extend the system to support 1-888 numbers.  Two of the first 888 numbers in existence were registered in the RespOrg system to me.  1-888-WENTZEL (1-888-936-8935) and 1-888-ASSHOLE.  I essentially "squatted" those two numbers.  The former I held on to until 2001 when I was tired of paying the yearly registration fee.  The latter I lost because I forgot to pay the yearly fee.  The RespOrg system, at the time, ran much like the domain name registration system does today.  I later found out Howard Stern had purchased the rights to 877-ASSHOLE, reportedly for many hundreds of thousands of dollars.  
Oh well.  
The entire RespOrg project was solely an issue of data management.  How do we take a hierarchical database and extend it for flexibility?  But I *still* didn't realize I was doing data management and "SQL".  
My first real job out of college was as a help desk tech.  My cube-mate was using Teradata and was having problems with her query.  I leaned over and helped her with her filtering, and she asked if I knew "SQL".  I wasn't really sure, "but I know how to do THAT."  It turned out that THAT was SQL.  
My value was soon recognized and I was promoted into a better job, writing queries.   
SQL and data management isn't going anywhere, regardless of the NoSQL movement (if it is a "movement", is it "religious" or "bowel"?).  
Life Lesson Number 4: Rebooting rarely fixes anything on a server.  

Somehow I keep digressing from VAX/VMS.  
The VAX 11/780 that we had in college never required rebooting.  I was "in charge" of this beast and I religiously patched the software whenever we got a "kit" in the mail from DEC.  But it never needed to be rebooted.  
One time during my junior year we were hit with an internal email-bomb.  We allowed people to write basic email macros that let them do things like auto-reply to emails and set up "distribution lists".  We didn't put a lot of security around this, so eventually a CompSci major (really, a "script kiddie") got his kicks by writing an email-bomb that put a SEND command in an infinite loop using auto-reply.  I was contacted when people started complaining that their emails were being delayed-on-send by a couple of HOURS.  
It probably took me 20 minutes to realize the problem was the macro and then disable it.  I expected response time to improve, but it didn't.  The VAX was crawling along trying to empty its mail queue.  I didn't think it was a big deal, I assumed it would eventually clear its messages...which were all failing delivery anyway.  It wasn't long before the VP came along and asked me why I didn't reboot it yet.  
"Um, you can reboot a VAX?"  I had never done that.  I rebooted my IBM PS/1 daily.  But I didn't know how to reboot something the size of a washer/dryer.  I wasn't really a "system operator" and had not prepared for that.  Remember, I was not a CS major and my formal VAX training involved RTFM.  I was just a quick-study liberal arts guy.  I knew how to restore VAX files from tape, but that's it.  I told the VP that I would reboot it and he left.  I never did reboot it.  I let the queues clear naturally.  VMS was such a stable OS that there are stories of VAX systems being up for decades between reboots.  Assuming you don't do something totally stupid, and you keep your system properly patched, follow good security principles, and do not over-engineer your software, rebooting should be a rarity.  
I still have co-workers today who believe it is necessary to reboot a SQL Server every 30 days.  Why?  "The error log gets too big to open up so it's just easier to reboot, which gives me a new error log."  ...Or..."It's just good to clear out the old memory once in a while."  Yeah, a cold cache is better than a warm cache.  BTW, these are enterprise-class servers.  
Life Lesson Number 5:  Never Lie in IT, there's always someone smarter, or with an ax to grind, that will make you look foolish
Eventually, later in my junior year, we migrated the 780 to a VAXstation.  The "microVAX" was a "minicomputer" with the form factor of a desktop PC.  The 780 was the size of a refrigerator/washer/dryer combo.  
The process of migrating the data required one reboot of the 780.  Although I was highly trusted considering I wasn't a CS major, the VP wanted to shadow me.  When it came time to issue the reboot command I still didn't know how to do it.  Why didn't I research it last time?
VP:  "I thought you rebooted the VAX after the mail-bomb."
Me:  "Oh I did, I just can't remember what command I used."
VP:  "Do me a favor and type 'show sys' at the prompt."  
(example show sys output)
Damn.  I was caught.  You see, "show sys" will tell you the last time a VMS system was rebooted.  I slowly typed the command.  The response indicated the last reboot was 7 years ago.  I'm not kidding on that either.  7 years.  
The VP scowled.  
Me:  "I guess the reboot didn't 'take'."
VP:  "Don't lie to me, you didn't reboot, did you?"
It was at this moment when I learned that "lying" and "professionalism" are mutually exclusive.  Yes, I 'fessed up.  But did I really "learn" my lesson?  

Lessons Learned from VMS

A couple more funny VMS stories...
Life Lesson Number 6:  Early morning Vaxercise is good for your health

Some Life Lessons from VAX/VMS

HP announced back in June that end-of-support for OpenVMS will be 2020.  I've been compiling this blog post since then.  I have very fond memories of VAX/VMS...I used one to store my pillow for two years.  Funny story.  I learned a lot of life lessons from VMS.  Originally this was one big ol' blog post but I decided to break it up so the post wasn't so long.  These posts are just a collection of funny VAX/VMS stories from my youth.  I really believe that I learned the most in my career from my few years working on VMS.  

Life Lesson Number 1: you need to have a sense of humor to work in this industry

Many people have speculated that HAL in 2001: A Space Odyssey was really supposed to be a play on words: each letter of HAL comes one letter before the corresponding letter of IBM.  I guess that makes it "one better."  Mathematically: rot(IBM, -1) = HAL.  Dave Cutler, the architect of VMS, was vehemently anti-IBM, and when he went on to build WNT, which was the acronym for Windows NT before it was known as Windows NT, it is believed that WNT was "the next evolution" of VMS.  Mathematically: rot(VMS, +1) = WNT.  That is a great sense of humor.
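The letter arithmetic is easy to check in a few lines of Python:

```python
# Shift each letter of an uppercase acronym by n places in the alphabet.
def rot(s, n):
    return "".join(chr((ord(c) - ord("A") + n) % 26 + ord("A")) for c in s)

print(rot("IBM", -1))  # HAL
print(rot("VMS", 1))   # WNT
```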
Life Lesson Number 2: Hard Work Pays Off
When I was in college I had a work/study job as a computer lab technician.  Basically a lab tech is supposed to sit at the front of the "lab" and help people print documents, log in to e-mail, etc.  This was a few years before computers were ubiquitous in every dorm room.  I was the only lab tech that actually walked around and helped kids and didn't just sit there playing games.  I'm serious...  It was so bad that I was promoted to head lab technician by Thanksgiving of my freshman year.  
I was promoted because I showed initiative and because I came up with snappy little "tricks" to make computers easier for the less savvy.  An example...our campus email was housed on a big VAX/VMS system.  We had to either login from a VT100 terminal or use a program called Kermit to access our email on old DOS 286's.  Kermit was actually a very popular communications protocol.  It is a backronym for "KL10 Error-Free Reciprocal Microprocessor Interchange Over TTY lines."
I digress.  To say the least VT100's and Kermit were not user-friendly.  Students would send email and it would immediately bounce if the address was not formatted perfectly.  The syntax was "smtp%'email@address.com'".  And, yes, those are single quotes embedded in double-quotes.  In those days it was not well known that smtp meant "simple mail transfer protocol".  Lots of kiddies would type "stmp", "smpt", etc.  The error message was beyond cryptic when the format was not correct.  This led to lots of frustration.  I always explained that it was very simple, "send mail to postman."  People like when the difficult is distilled down to the simple.  I eventually wrote a VMS "macro" to make the syntax even easier for people.  
Just before Christmas break the VP of Information Services asked me if I would be interested in being the "computer operator" for the department, again, because I showed initiative.  The job was simple, take the backup tapes off-site every morning, check the status of scheduled jobs, answer emails for special requests, etc.  
I now had an easier job, could make my own hours, and was accountable to almost no one.  As long as the job got done, nobody bothered me. 
Hard work does pay off.  
Life Lesson Number 3:  When you fail, fail big.  (But not too big).  Or, your first intuition is probably your best intuition. 
You always want your failures to be big enough that they get you promoted because you solved a big problem, but not so big that you get fired for them.  

I had a part-time job while I was in college as the computer operator for a VMS system for a large, formerly monopolistic, telecom company.  This was way back before email was ubiquitous.  VMS has two e-mail programs…PINE and MAIL.  Neither has the concept of a “sent items” folder.  PINE is the “preferred” email program everyone used.  To set up a “sent items” folder you did this:

customized-hdrs=fcc: SENTITEMS

…where fcc stands for “folder carbon copy”.  It files a copy of the message into a folder instead of mailing it to a recipient.  Cool, right?  In VMS, folders were not allowed to have spaces, much like in old DOS.  So if you typed this:

customized-hdrs=fcc: SENT ITEMS

…or this…

customized-hdrs=fcc: SENT,ITEMS

…you got the same thing, two copies of your email…one in a folder called SENT and another in a folder called ITEMS.  With me?   A comma and a space were the same thing. 

In Outlook you have the “Outbox” which is the temporary holding area where your mail sits until you can connect to Exchange.   VMS had the same thing but it was called OUTMAIL.  So, I really wanted to have my “Sent Items” called “OUTMAILCOPY”.  But I screwed up and typed this:

customized-hdrs=fcc: OUTMAIL COPY

Notice the problem?  Every email was copied to two folders…one of them, of course, being OUTMAIL, which caused an infinite send loop and took email down for a few hours.  
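I don't have the PINE source handy, so this is just a sketch of the behavior described above...commas and whitespace treated interchangeably as folder separators:

```python
# Sketch of the fcc: folder-list parsing described above: a comma and
# a space are the same thing, so "SENT ITEMS" means TWO folders.
import re

def fcc_folders(value):
    """Split an fcc: value on commas and/or whitespace."""
    return [f for f in re.split(r"[,\s]+", value.strip()) if f]

print(fcc_folders("SENTITEMS"))     # ['SENTITEMS']
print(fcc_folders("SENT ITEMS"))    # ['SENT', 'ITEMS']
print(fcc_folders("OUTMAIL COPY"))  # ['OUTMAIL', 'COPY'] -- my mail bomb:
# every send got filed back into OUTMAIL, which sent it again.
```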

So I crashed the email system by building my own little "mail bomb" macro.  I managed to fix it myself but by then people were paged and wondering what was going on.  I tried to hide what I did, "Don't worry it's fixed now."  

And surprisingly, that worked.  "Really?  You fixed mail all by yourself?  That's great Dave, we should promote you."  

This is where hubris of the young takes over, "Oh, it was no big deal really.  I know what I'm doing."   But of course I didn't know what I was doing.  At this point my failure was big but I had the opportunity to be promoted.  The next morning management began investigating the logs and saw that clearly it was my mail account that caused all of the problems.  Management's tone went from "let's promote Dave" to "let's fire Dave".  

I was spared.  But I was told I could no longer use PINE for email.  

Sidenote:  In VMS MAIL you use this command for “Sent Items”:

SET COPY_SELF SEND

…which cc’s you on every mail you send, right to NEWMAIL (Inbox).  

More funny stories to come...


Metrics are a funny thing.  You hear people spout out metrics all the time.  "My Page Life Expectancies are 200."  "My disk latencies are 20 ms."  Just regurgitating a metric says nothing about context.  There is no good guidance for PLEs or disk latencies, and if you ask ten knowledgeable data professionals you'll likely get ten different answers.  But put all of the opinions together and you'll get a good "feel" for what might be good and what might be bad.  My general rule of thumb, I call it the "smell test", is this: if I don't have a benchmarked baseline for a metric, I compare it to "reasonable guidance".  If my metric is close, I look at some other aspect of the system to tune.  If my metric is wildly outside the range of reasonable guidance, that is worth investigating further.  

Disk metrics are really tough.  There is so much abstraction going on with a system hooked up to a SAN that any given metric could be normal even if it falls wildly outside of the norm.  Latencies are one such metric.  About a year ago I pinned down an MS SQL Server field engineer who stated the then-current latency guidance (10 ms).  I blogged about that in my post Latency and IOPs.  I had a harder time extracting that guidance than that Microsoft employee's dentist had extracting his abscessed tooth.  

At the time I was told that IOPS of 100 Page Reads/sec was the guidance for "high".  We were at the time averaging 400 with sustained peaks over 4000.  This guidance made sense to me at the time because I knew our database code was horrendous (too much tempdb bucketing in procedures, lousy index decisions, procedures that returned 64MB result sets, lots of PAGEIOLATCH_SH waits, etc) and I was working on fixing the most egregious things.  So, using "Dave's Smell Test" I knew that our IOPS were probably outside of the norm and warranted further investigation.  

Now, I've always thought IOPS was a kinda nebulous term.  What is the definition of an IO?  Is it a 4KB block, or a 64KB block, which is of course what SQL Server cares about?  Depending on the definition, your IOPS number could be off by a factor of 16.  Clearly it is in the interests of the vendor to spout off the highest IOPS number possible in their marketing material...so do you think the published number is for 4KB or 64KB blocks?  Read the fine print.  
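A quick sanity check on that factor of 16, using a made-up throughput number:

```python
# An "IOPS" figure is meaningless without the block size.
# The same 100 MB/s of throughput is 25,600 IOPS at 4KB blocks
# but only 1,600 IOPS at 64KB blocks -- a factor of 16.
def iops(throughput_mb_per_s, block_kb):
    return throughput_mb_per_s * 1024 // block_kb

print(iops(100, 4))   # 25600
print(iops(100, 64))  # 1600
```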

It just seems a bit too confusing, which is why I prefer my disk metrics to always be in terms of "latency", not IOPS.  Your IO subsystem's goal should be, IMHO, the least amount of latency as possible, with the highest IOPS.  This is why you'll often see DBAs ask for their HBAs to have higher Q depth settings than the factory defaults.  Lower (default) Q depths will sacrifice some latency for better IOPS.  That seems backwards to me on an OLTP system.  

And then of course you have to take into consideration your "effective IOPS".  This is a term I made up (I think, anyway) that takes into consideration whether you are using RAID and which RAID level.  A single disk drive has a raw IOPS number.  RAID 10 effectively halves your raw write IOPS.  Why?  An extra write is required for the mirror.  It's twice as bad with RAID 5 (the RAID 5 write penalty).  For instance, SATA drives average about 150 IOPS.  If I had 6 of those I would have about 900 raw IOPS, but I need to halve that number to get the true 450 write IOPS at RAID 10.  This is all theoretical and confusing, which is why I tend to ignore IOPS.  
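Here's that "effective IOPS" arithmetic as a sketch.  The drive IOPS and penalty values are the rough rules of thumb from above, not measurements:

```python
# "Effective IOPS": discount raw drive IOPS by the RAID write penalty.
# RAID 10 penalty = 2 (extra write for the mirror);
# RAID 5 penalty = 4 (read data, read parity, write data, write parity).
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4}

def effective_write_iops(drives, iops_per_drive, raid):
    return drives * iops_per_drive // WRITE_PENALTY[raid]

# 6 SATA drives at ~150 IOPS each: 900 raw, 450 effective at RAID 10.
print(effective_write_iops(6, 150, "RAID10"))  # 450
print(effective_write_iops(6, 150, "RAID5"))   # 225
```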

IOPS numbers tend to NEVER hit the published numbers, especially on rotational media.  This is because database access tends to be somewhat random in nature.  Sequential transaction log access is the exception.  The more that disk head has to move, the worse the effective IOPS.  That's another reason why I firmly believe that ALTER INDEX...REORGANIZE is one of the best ways to speed up your effective IOPS.  

But even the latest SSDs will rarely give you IOPS numbers anywhere close to what they publish.  You'll likely saturate the PCI-E bus first.  

So I find myself not really relying on published IOPS or even "reasonable guidance".  There is no substitute for measuring your disk subsystem yourself.  Even so, once you have a good measurement, is there anything you can really do about it if it isn't up to the smell test?  Other than changing HBA QDs or using local SSDs for tempdb, I haven't had much luck in my career getting IO subsystems upgraded, even when they were woefully under-spec'd/misconfigured given my measured latency and IOPS.  The only thing I can ever really do is to lessen my reliance on IO.  And that of course means good indexing, good data hygiene, good caching, adding RAM, and tweaking IO intensive queries.  

CONNECT BUG: When SET IMPLICIT_TRANSACTIONS ON not every SELECT will start a transaction

I submitted the following bug to Connect due to some anomalies with SQL Server implicit transactions: when SET IMPLICIT_TRANSACTIONS is ON, not every SELECT will start a transaction.  

We are having some serious issues debugging implicit transactions from a java/jdbc application. In the process of doing that we think we encountered a bug (possibly just with the BOL documentation...regardless). Under the topic of SET IMPLICIT_TRANSACTIONS, BOL states: 

And here is how to reproduce the issue:  

Here is the actual repro script so you can try it yourself.  


Is a billionaire richer in Bucharest or Philly?

I was just code-reviewing a Romanian's code.   We got into a bit of an argument when we were looking at the query plan and saw that some operation was running this many times (sorry, no screenshot):

2000000000000
In the US (I think in most regions of India too), to make that easier to read we would write it as

2,000,000,000,000
…and most of us probably know that in most countries the decimal separator is actually the comma.  So in Romania the above number would be written as

2.000.000.000.000
...I think we all knew that.  But the interesting thing in the argument is how we would all “say” the number:

Me:  “about 2 trillion”

Alin (when speaking English):  “about 2 billion”

UK (in 1996):  “about 2 billion”

UK (present):  “about 2 trillion”

India:  (not sure)

This always confuses me when speaking to someone from Europe because they (mostly) use Long Scale notation and US (UK too for like 15 years now) uses Short Scale notation. 

Long Scale:  every term over a million = a million times the previous term (billion means a million million)

Short Scale:  every term over a million = a thousand times the previous term (billion means a thousand million). 

So for us short scalers…how would a Romanian say this number (in English of course):

1,000,000,000
Short scalers say “one billion”  and I *think* most long scalers would say “one thousand millions” ( or maybe “milliard” like in France).

The numbers are still the same it’s just how we say them. 
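The two conventions as a toy lookup, covering just the handful of names used in this post:

```python
# Name a power of ten under both conventions.
# Short scale: each name above "million" is 1,000x the previous.
# Long scale: each name above "million" is 1,000,000x the previous.
SHORT = {6: "million", 9: "billion", 12: "trillion"}
LONG = {6: "million", 9: "milliard", 12: "billion"}

def name(power_of_ten, scale):
    return scale[power_of_ten]

print(name(12, SHORT))  # trillion -- what I say
print(name(12, LONG))   # billion  -- what Alin says
print(name(9, SHORT))   # billion
print(name(9, LONG))    # milliard
```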

Side note:  I can guarantee you that Zimbabwe uses the Short Scale system.  The largest denomination currency bill ever produced was the Zimbabwe One Hundred Trillion Dollar note a few years back. They had a bit of a problem with hyperinflation...that note is now worth about 5 cents (actually about five bucks on eBay due to its historical and novelty value...I own a few).  Note the number of ZEROS.  Definitely short scale.  


So how do you make this less ambiguous when we are talking?  No clue. Use power notation?  Stop talking to each other and always work from home?

I find this fascinating.  It's a shame that something that is so scientific and should be a priori knowledge is in fact confusing and cumbersome when we try to communicate it verbally.  So, a billionaire is richer in Bucharest.  I guess.