Oracle’s Jump Into the Big Data Realm
Many of you may have seen that Oracle officially announced its new Big Data product offerings today. Included in that list are the Apache distribution of Hadoop, Oracle Loader for Hadoop, Oracle Data Integrator Adapter for Hadoop, the Oracle NoSQL Database, and Oracle R Enterprise. On the Oracle big data pages there seems to be some confusion as to whether R will run as a custom Oracle version or an open source version. Knowing Oracle, it may be both. They have been working on a version of R within the Oracle database and might be augmenting it with an open source version on the Hadoop appliance. Let's cover what is known so far about the different components.
Apache Distribution of Hadoop
I'm actually really surprised here. Larry Ellison previously acquired a Mike Olson company (Sleepycat), and Mike now runs Cloudera. I would have bet good money that a partnership would have been formed. My guess is that Oracle looked at the management tools Cloudera provides and determined that they would be too hard to integrate into its Enterprise Manager product, or that Cloudera's price was too high. Cloudera is leading the Hadoop support market right now and has a really good future. Hadoop comes with a very large open source ecosystem around it. This solution should be great for both Hadoop and Oracle.
Oracle Loader for Hadoop
From what I have heard, this is a MapReduce job that formats the resulting data set into an Oracle Data Pump file to be loaded directly into the database. We have had Sqoop to suck data out of the database, but now we have the other side of the coin. Hadoop is great, but joins are still a bit problematic and the BI tools around it don't match what can be done in the traditional database world. The loader should help companies figure out what the balance should be.
Oracle Data Integrator Adapter for Hadoop
The Data Integrator Adapter is similar to the Oracle Loader for Hadoop, but it extends Oracle's Data Integrator product to be able to execute and manage Hadoop jobs as part of an ETL process. It is well known that Hadoop can crunch and count numbers faster than the Oracle Database in many cases. This allows the ETL process to offload the heavy number crunching and then use the Loader to put the data into the Oracle database when complete.
Oracle NoSQL Database
For a long time it has seemed like Oracle was neglecting the Berkeley DB product and not making giant leaps forward. Berkeley DB has always been a fantastic product for key-value stores. In fact, many of the major key-value stores today are underpinned by Berkeley DB. It looks like Oracle has brought many of the missing distributed features into the new product. It will be interesting to see how the new TimesTen Database, NoSQL Database, Hadoop, and Exadata components work together in the Oracle BI tools over time.
Oracle R Enterprise
R has long been the language of choice for the statistics community. It isn't clear if Oracle will be using the open source R-project.org distribution or has released their own. My guess is both. R-project will be deployed with the Big Data appliance and Oracle's R within the database. This should make a bunch of the big data number crunchers from the SAS world happy.
I have been told that Larry and TK have given the green light to go full force into NoSQL. If that isn't justification that it's a "real thing" that is here to stay I don't know what is. Oracle has and will continue to invest significant resources into Big Data.
The team here at UberEther is happy to see the largest software vendor come to the table and support the Hadoop community. It relieves some of our worries in bringing our new log aggregation and risk-adjustable access control product to market. We know our product will be able to run on predefined hardware platforms for our largest customers, and we can easily load the data into their legacy tools to reuse their existing investments. We still have a lot of work to do, but if you're interested in hearing more while we're out here at Open World, contact us and we'd be more than happy to show you what we've been working on.
Taking a Techcation, Want Some Ideas
So Jake over at the App Lab has been known to take many a staycation in his time. I'm notorious for never taking any of my vacation (I've been at my 180 hour cap for close to a year). So I'm going to start a new trend by taking a "techcation." My goal is to take a week away from the office, play with a different technology every day, and blog my progress at the end of each day. My target week is that of March 13, so I have 2 weeks to prepare. There's been a bunch of different things on my desk and in my mind that I've been meaning to find time to play with but, as with everyone, work's been taking all my time lately. So I'm going to stay home and play with something new every day. So far here is my list of things I'm thinking about:
- Arduino something (Thanks Chris)
- Hadoop 101
- Cassandra 101
- MongoDB 101
- Android 101
This is a very tentative list. Anything else people think I should be looking at? I'm looking for something where I can get a basic knowledge and have a working "something" up and running within 8 hours. Feel free to hit the comments with suggestions. As much as I love the identity world, I'm hoping to stay far far away for the week. I'd love to see this idea take off. Even better would be if people started pairing up or doing it in small groups.
I might drop one of the days to work on the 280Z or my truck if it's nice. I've had two pairs of speakers sitting on the floor of my office to go into each vehicle for about 6 months now. I've also got a bedslide for the truck that's been taking up garage space for over a year. I'm sure my wife would appreciate that one.
Oracle IRM and LDAP Accounts
One of the great features of Oracle IRM 11g is being able to automatically link your users from LDAP. This way you don't have to manage the users in two places or write any custom synchronization code between them. The LDAP integration is done through the WebLogic providers. A word of warning about this integration: you need to have your LDAP provider set up in WebLogic before logging into IRM for the first time. You also need to have your LDAP provider as the first item in the list before logging in.
My reason for typing this out is because it burned me for a few hours trying to figure out what was going on. The first user that logs into IRM is set to be the administrator. This person creates the contexts, roles, etc. and assigns all the privileges. For most implementations people normally use the built in weblogic user that is created during installation. This is where I went terribly wrong. IRM binds the GUID of this user to the IRM database repository. This is obviously much stronger than binding just the username or the DN of the user but also can cause crinkled skulls when trying to debug.
So, I logged into the server as weblogic and got the tabs and pages I expected, so I figured I would set up my LDAP provider. I went into WebLogic and created the provider. In a development environment I normally set the internal provider first and then the LDAP provider second. Even though I take a hit in performance, in a development environment this prevents me from being locked out of the server. I was now able to authenticate, but IRM wasn't letting me into the interface. After talking to the dev team, they told me the LDAP provider had to be the first one in the list, as that's all they look at when authenticating the user. No problem, pop it up to the top of the list.
Now I can authenticate into IRM and get into the irm_rights pages, but I don't have any of the other tabs to manage the server. The GUID that my weblogic user is tied to is now second on the providers list, and the GUID of the weblogic user in my LDAP server doesn't match the local GUID. Shit. So now my administrator user can't be reached because he's second in the provider list, and I can't set any other administrators, because if I move that provider back to the top of the list the LDAP users don't appear, since IRM only looks at the first one.
Lessons learned and a reinstall to fix.
Installing Firefox 4 Beta on OSX
So today I decided to switch over to the mainline Firefox 4.0 Beta from the Minefield dailies I've been using. I started to copy the new Firefox.app directory over the existing one in my Applications folder and immediately was met with a:
'The operation can't be completed because the item "libsmime3.dylib" is in use.'
Well shit, what's holding onto that lib? Turns out that it was my Cisco AnyConnect VPN agent, aka vpnagentd. So I go ahead and do a:
'sudo killall vpnagentd'
Ugh, and of course it restarts automatically before I can copy the files over. So what now? Oh yeah, it runs as a daemon, so I need to use my old friend launchctl to unload it. The command for this is:
sudo launchctl unload /Library/LaunchDaemons/com.cisco.anyconnect.vpnagentd.plist
w00t! Now the files copy over with no problem. Now I need to put AnyConnect back into place. This can be done with:
sudo launchctl load /Library/LaunchDaemons/com.cisco.anyconnect.vpnagentd.plist
That should do it, Firefox 4 Beta up online and working again.
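The unload, copy, load sequence above can be wrapped so the daemon is always loaded again, even if the copy fails. This is only a sketch, assuming bash. The function name `with_daemon_unloaded` and the `SUDO` and `LAUNCHCTL` override variables are made up for illustration (they allow a dry run without touching the real daemon), and the source path in the usage comment is a placeholder, not where my download actually lived.

```bash
# Run a command while a launchd daemon is unloaded, then load it again.
# SUDO and LAUNCHCTL are hypothetical overrides for dry runs; when they
# are unset, the real sudo and launchctl are used.
with_daemon_unloaded() {
    local plist="$1"; shift
    local status=0
    # Stop launchd from restarting the daemon while we work.
    ${SUDO-sudo} ${LAUNCHCTL-launchctl} unload "$plist" || return 1
    # Run whatever needed the daemon out of the way (for example the copy).
    "$@" || status=$?
    # Always try to bring the daemon back, even if the command failed.
    ${SUDO-sudo} ${LAUNCHCTL-launchctl} load "$plist"
    return "$status"
}

# Example (the source path is a placeholder):
#   with_daemon_unloaded /Library/LaunchDaemons/com.cisco.anyconnect.vpnagentd.plist \
#       cp -R /path/to/new/Firefox.app /Applications/
```

Setting `SUDO=""` and `LAUNCHCTL="echo launchctl"` first prints the launchctl calls instead of running them, which is a cheap way to check the order before doing it for real.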
That was 15 minutes of my life I'll never get back. Hopefully this post saves you 10.
My Notes from Tonight's Exalogic Release
As I'm sure all of you have seen, Larry finally announced our new Exalogic box tonight. It's an amazing piece of hardware and software married together. I'll probably be posting more about this over the next few weeks, but here are my notes from the keynote:
360 Cores
30 Servers
40 Gbps InfiniBand Backend
Direct connection to Exadata
Integrated Storage Appliance to house application software and files, patch the VM by downloading one file and it patches all the software on the device by placing it on the storage appliance
2 Guest OSes: Solaris and Linux
Coherence for memory synchronization, illusion of one unified memory system
Software optimized for the hardware
No single points of failure on the box at all
Duplexed fault tolerant storage
Can patch the hardware and software consistently across all their customers because it's all engineered the same with standard configurations
Internet Apps: 12x improvement, over 1 million HTTP requests per second, Facebook's traffic in 2 full racks
Messaging Apps: 4.5x improvement, 1.8 million messages per second, all of China's rail ticketing in 1 rack
2.8 TB of DRAM
960 GB of solid state disk for persistence
4 TB Read Cache
40 TB SAS Disk Storage
72 GB Write Cache
1.2 microsecond latency
10 Gb Ethernet to the Database
10x Network Latency reduction
Eliminated buffer copies
64K packets vs. standard 4K
3x throughput vs. 10 GbE
Parallelized Message Queues in WebLogic for Exalogic, multiplexed over infiniband
Dynamic load balancing to Exadata Database RAC nodes with transaction affinity to appropriate RAC nodes to maximize locality
SQLNet over InfiniBand (SDP) to maximize JDBC performance
Instant failover in case of node failure due to InfiniBand's underlying protocol
Cache Coherence
Instance state replication across nodes
Near instant access to data from other compute nodes
Lossless infiniband network enables instantaneous state failure detection and failover
Solid State Disk eliminates Java VM heap limitations
Reduces GC pause times and increases cache capacity
1/4 of a rack to 8 full racks in a single cloud
Runs all applications, not just WebLogic
Built in application isolation and security with weblogic domains
Built in network isolation and security with infiniband partitions and virtual lanes
Built for elastic capacity on demand
Maintain a balanced system as your size and number of racks grow
Standardized and easy to manage
Exadata machines phone home when there are problems, unified management and monitoring interfaces
Single patch for exalogic systems
all customers run the same configuration
all software components can be patched together
all patches are built, packaged, and tested together
enterprise manager automates patch and upgrade procedures
IBM's best vs Exalogic
Exalogic => $1,075,000, 40% more CPUs over IBM, horizontal scale out infrastructure, InfiniBand fabric, fully fault tolerant
Power 795 => $4,440,000, Old SMP vertical scale-up system, no fault tolerance
Elastic and Virtual Public and Private Clouds in a Box
Best Performance and Cost Performance in one
Exalogic VM and OS
Both Linux and Solaris as the guest OSes
Elastic Capacity on Demand to add and remove VMs
Fault Isolation to the virtual network level
2-4% CPU overhead for OracleVM virtualization
Instantaneous migration across VMs
Single Root I/O Virtualization (SR-IOV)
dedicated I/O bypassing hypervisor
low latency
50% better infiniband utilization
Dual Personality to Next Version of Linux
OEL -> 5000 customers
Compatible with RHEL
Never been a compatibility bug between RHEL and OEL
RH does not test releases with Oracle products
RedHat is slow to mainline community enhancements, their current kernel is nearly 4 years old
Announcing: The Unbreakable Enterprise Kernel for Linux
Fast, modern, reliable, and optimized for Oracle
Used by Exa-machines for extreme performance
Allows oracle to innovate without sacrificing compatibility
Oracle vs RedHat
5x on flash cache reads
2.4x on solid state disk access
3x Infiniband RDS messages (IOPS)
1.8x transactions per minute on 8 socket database OLTP
Bigger servers up to 4096 CPUs and 2TB of memory
Up to 4 PB of disks
Advanced NUMA support
ACPI 4.0 support for power and cooling savings
Data integrity to make sure corrupt data cannot be written
Database data integrity enabled ASM drivers now
Hardware Fault Management built in
Diagnostic Tools with less overhead and performance for tracing
Upgrade by recompiling the kernel, no reinstall needed, OEL and RHEL customers both
Some more updates:
All of the middleware machines are diskless and boot from the storage server, which also centralizes all the logs
The storage server is a Sun Storage 7000 at its core
Most organizations will get over 240 managed servers per full rack
You obviously don't have to utilize an Exadata rack with Exalogic, but it's certainly going to be better with the InfiniBand backends
Each managed server should be assigned its own VIP for failover and load balancing as it moves across the nodes; these map to a single application VIP
Any OEL5 or Solaris 11 application should be able to run on the machine out of the box. It doesn't get the secret sauce of Java performance, but will likely see gains due to the matched hardware.
Most customers will also need to license the ExaLogic Elastic Cloud software and WebLogic Suite for each compute node they purchase (per processor). Can reuse if they already have them.
BI Publisher is shipped to report on top of the audit logs generated from the server, it is recommended that the logs are loaded into an Oracle Database for easier reporting.
ExaLogic machines will be available in the ETCs throughout the country on the rotation program.
No word yet if you can combine a 1/2 rack of ExaLogic and a 1/2 rack of ExaData into a single physical rack when ordering.
Virtual Machine Beep Blowing Your Ears Out?
I'm known to have a set of headphones glued to my head while working. Sadly, on a daily basis I find myself blowing my ears out when I hit Tab in VMware and the PC speaker beep goes off. Luckily I found this command to stop that from ever happening again:
$ echo "blacklist pcspkr"|sudo tee -a /etc/modprobe.d/blacklist.conf
I've added that to all my VM templates so hopefully I'll save my hearing from now on. Pretty sad that this is my only blog post this year.
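Since the one-liner appends every time it runs, running it twice on the same template leaves duplicate lines behind. Here is a rough sketch of a repeat-safe version, assuming bash. The function name `blacklist_module` and the `SUDO` override variable are made up for illustration (the override lets you try it on a scratch file without root); the default file is the same one used above.

```bash
# Append "blacklist <module>" to a modprobe config file only if the exact
# line is not already there. SUDO is a hypothetical override; when it is
# unset, the real sudo is used for the write.
blacklist_module() {
    local module="$1"
    local file="${2:-/etc/modprobe.d/blacklist.conf}"
    local line="blacklist $module"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if grep -qxF "$line" "$file" 2>/dev/null; then
        echo "already present: $line"
    else
        echo "$line" | ${SUDO-sudo} tee -a "$file" >/dev/null
        echo "added: $line"
    fi
}

# Example:
#   blacklist_module pcspkr
```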
My DBA changed the name of my OIM database, what do I do?
I've been spending a lot of time working with Oracle Identity Manager (OIM) lately, and I'm putting together some walkthroughs / tips and tricks on working with the product. I'll start with a quick and short one today. Sometimes DBAs get wild hairs. I think it's in their nature: they just decide they want to rename a database and tell all the developers to update their code / JDBC data sources to reflect that. Inside an OIM deployment on OC4J there are two files you need to change to make this happen. They are:
- <OH>/j2ee/<oim container>/config/data-sources.xml (2 places)
- <OIM HOME>/xellerate/config/xlconfig.xml (1 place)
If you search for 'jdbc:oracle' in each of the files you should be able to find these references pretty quickly. These are also the places you need to modify things if you are migrating into a RAC environment or if your IP address / hostname / port change on the database.
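The search itself is a one-line grep. A minimal sketch, assuming bash and a grep that supports `-H` (both GNU and BSD grep do). The function name `find_jdbc_refs` and the `OH`, `OIM_CONTAINER`, and `OIM_HOME` variables in the usage comment are made up for illustration; they stand in for the placeholders in the list above, so substitute your real paths.

```bash
# Print every line that mentions a JDBC URL in the given config files,
# prefixed with the file name and line number.
find_jdbc_refs() {
    # -n: show line numbers, -H: always show the file name,
    # -F: treat the pattern as a fixed string rather than a regex
    grep -nHF 'jdbc:oracle' "$@"
}

# Example (variables are placeholders for your own install locations):
#   find_jdbc_refs "$OH/j2ee/$OIM_CONTAINER/config/data-sources.xml" \
#                  "$OIM_HOME/xellerate/config/xlconfig.xml"
```

Running it before and after the edit is a quick way to confirm that no reference to the old database name was missed.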
Hopefully I'll have something with more meat to post this weekend.
Ora-Click-Clack Weekly Review #1
Well, as part of starting Ora-Click I wanted to start a series of blogs covering the top 5 articles of the week. Due to some technical difficulties I'm a week late, but I promise to make it extra extra interesting. So without further ado, here are the top 5.
- The big story of the week wasn't Jake's post on the "8 Things" but actually the Use Your Nose to Install an Oracle Database article. Some people claim that Oracle software is among the hardest to install. I tend to agree that many of the earlier releases (<8) were a pain in the butt, but Oracle's come a long way since then. Just to prove it, Howard posted a link showing exactly how easy it is. I would love to link to it but, as I'll discuss later, Howard has shut down his blog until the "8 Things" craze slows down. When he gets back to earth the article can be found here.
- The second article covers the new fixes and features of the latest mix.oracle.com release. I'm proud to be member 41 on the site and really think it's a great way to get open and honest customer feedback. As the site grows there are definitely some new features that need to be added to help manage consuming all the new content, but knowing the AppsLab guys they're already working on it. I would love to see Oracle open up the source to mix, as suggested by Jake here. I know that Anthony and Rich are overloaded fixing bugs and working on new features, it would be great to have the Oracle
Oracle Open World 2007
Let the barrage of OpenWorld begin. Between two internal preparation meetings tomorrow, finalizing my slides tonight, and various other activities getting ready, it already has. I'm sure many people have seen that I'm speaking this year, and for my first year presenting I'm going to be pretty busy. I have 4 official presentations and I'm hoping to get a fifth with the Unconference event going on. I'm going to try to spend a good chunk of my time in sessions. This is a really exciting year with all the 11g Application Server things being demoed, and I can finally start talking about all the cool stuff I've been building in the beta period. If you want to meet up, give me a ring during the week (my cell number is over on the right) or feel free to come and heckle me at one of my sessions.
| Session Title | Speaker(s) | Date/Time | Venue/Room |
| IOUG MiddleWare SIG Meeting: Is that really you? Prove it! | Matt Topper, IT Convergence; Dan Norris, Piocon | | Moscone West 2005, L2 |
| IOUG Enterprise Best Practices SIG: Oracle Database 11g Beta Testing Panel | Matt Topper, IT Convergence, and others | | Moscone West 2004, L2 |
| IOUG: Oracle Identity Management--The Total Identity Solution | Matt Topper, IT Convergence | | Moscone West 3006, L3 |
| IOUG: Demystifying Oracle Fusion Middleware | Matt Topper, IT Convergence | | Hilton Yosemite Room C |
I'm also planning on attending the ACE Dinner, AppsLab Meetup, and Oracle Blogger Meetup. I'm hoping to chronicle the debauchery of the week via all my feeds: here, on Twitter, and through Flickr. I can't wait to see everyone there!




