Over the past few years, I have talked with fellow Teamcenter administrators who were under pressure from both users and management to improve Teamcenter performance. I am also one of those pressured administrators. I've felt the heat from users and management for what we all thought were problems specific to Teamcenter.
My experience has been a lot like that of other Teamcenter administrators. Users complain about performance but give little to no detail to help you troubleshoot their problems. I peppered IT with questions about our network and servers to troubleshoot the performance problems and got little in return. There was only one person left who could help. That person was me.
I started researching and learning more about databases, networking, performance monitoring, servers, clients, and Teamcenter than I ever had before. I attended many presentations, read many articles and books, and dug through every piece of documentation on databases, networks, and Teamcenter that I could find. I had no idea how much I had yet to learn. It felt overwhelming at times, but I kept going.
What I learned really surprised me. Teamcenter wasn’t the real problem after all.
This is my story.
2007-2008 – Teamcenter Planning and Procurement of Server Hardware
After reading a lot of information (and attending many presentations at the annual PLM World Conference) on recommendations for Teamcenter hardware/software/tuning/etc., I requested the recommended server hardware/software for both database and Teamcenter middle tier servers. The database server was supposed to be 64-bit with dedicated high speed RAID and lots of RAM. What I got back instead was something very different.
This was because IT only used 32-bit blade-type servers and attached them all to an old NAS shared by many other applications/departments/users. I had to work within these IT limitations knowing they were less than optimal, but I didn't know just how much less than optimal they were really going to be! I was a SQL Server novice at the time. I didn't know for sure what we needed. I was dependent on IT to provide that kind of advice. They said it would be fine.
Our DB server had 32-bit SQL Server on a Windows 2003 (32-bit) blade-type server with 4G RAM and remote attached shared NAS storage for the DB files. This also served as our TC volume and license server. Those who know Teamcenter (and DBs) well already know that this is a recipe for disaster. I didn't (at the time). I was new to understanding the hardware-hungry needs of production databases.
Summer/Fall 2009 - Initial Rollout
When we first rolled out Teamcenter to our users, initial performance was a little slow but tolerable. However, performance became increasingly worse as we migrated more and more data from our old I-DEAS TDM into it. What took 10 minutes to check out/in now began to take 1.5-2 hours.
Suspecting the Teamcenter servers, I monitored/tuned/monitored/tuned them over and over with only incremental improvements. Something else was causing this slowdown. After monitoring DB performance for a while, I figured out that the DB server really needed RAM.
Fall 2009 – A Small Performance Boost?
We added 8G RAM for a total of 12G RAM on the database server, and we enabled AWE to allow SQL Server to access more than 4G RAM. Performance was immediately better but still bad. After doing even more research on SQL Server, I found out that SQL Server on a 32-bit OS with 12G RAM is still bottlenecked. This is because only the buffer pool's data cache can use the AWE memory beyond 4G. Everything else (plan cache, sorts, connections) still had to fit in the limited 2-3G of user-mode address space. I still had a lot more work to do.
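The 32-bit memory ceiling can be summarized with some back-of-the-envelope arithmetic. The figures below are the standard 32-bit Windows/SQL Server limits, not measurements from our server:

```python
# Why 12 GB of RAM on 32-bit SQL Server was still a bottleneck.
GB = 1024 ** 3

total_ram          = 12 * GB   # physical RAM after the upgrade
virtual_addr_space = 4 * GB    # 32-bit process virtual address space
user_mode_default  = 2 * GB    # default user-mode split (3 GB with /3GB)

# With AWE enabled, RAM above the 4 GB address space becomes reachable,
# but only for the buffer pool's data pages.
awe_reachable = total_ram - virtual_addr_space

# Plan cache, sort/hash workspace, connections, and threads still
# compete for the small user-mode region.
other_components_budget = user_mode_default

print(f"Data cache via AWE : {awe_reachable / GB:.0f} GB")
print(f"Everything else    : {other_components_budget / GB:.0f} GB")
```

So even with 12G installed, most of the engine was still living in the same 2-3G it had before the upgrade.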
Summer 2010 – A Bigger Performance Boost
After monitoring all the various server resources and doing some more reading on databases, I determined that another bottleneck was disk I/O on our attached shared NAS. Since our DB server was already RAM limited, it was even more dependent on the shared NAS. I would watch the NAS run at 100% disk usage for hours before slowing down, and since it was shared with other applications, it would run at 100% even before anyone did anything in Teamcenter. When someone finally did try to use Teamcenter, chaos (and a lot of griping) ensued.
I went to IT about this and asked again for a dedicated disk array for the DB files. They didn't think there was a problem with the NAS, so I had to monitor our NAS usage and create a report over a period of many weeks to show them that there was indeed an issue. This finally convinced them to move our DB files to a newer, faster NAS shared with fewer applications than the old one. Not exactly what I wanted, but it did improve performance about 25-30%. However, there were still spikes of bad performance due to other applications/users concurrently accessing the new NAS.
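The weekly report boiled down to finding stretches where the disk stayed pinned. As a hypothetical sketch of that kind of analysis, here is a small parser for perfmon-style CSV samples; the column names and the 95% threshold are illustrative assumptions, not our actual counters:

```python
# Flag sustained disk saturation in a CSV of "% disk busy" samples.
import csv
import io

def saturated_intervals(csv_text, threshold=95.0, min_samples=3):
    """Return (start, end) timestamp pairs where disk busy stayed at or
    above `threshold` for at least `min_samples` consecutive samples."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    intervals, run = [], []
    for row in rows:
        if float(row["disk_pct"]) >= threshold:
            run.append(row["time"])
        else:
            if len(run) >= min_samples:
                intervals.append((run[0], run[-1]))
            run = []
    if len(run) >= min_samples:
        intervals.append((run[0], run[-1]))
    return intervals

sample = """time,disk_pct
08:00,40
08:05,99
08:10,100
08:15,98
08:20,35
"""
print(saturated_intervals(sample))  # [('08:05', '08:15')]
```

A few weeks of output like this, correlated against Teamcenter usage (or the lack of it), made the shared-NAS problem hard to argue with.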
Spring/Summer 2011 – The Final Solution?
After further documenting our performance issues with the database server, I was finally allowed to purchase a server similar to the server that I originally asked for three years earlier: 64-bit Windows 2008 OS, 64-bit SQL 2008, 64G RAM, and a dedicated 16-disk internal array of high speed disks configured as RAID 10 directly attached to the server.
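For anyone sizing a similar array, the RAID 10 math is straightforward: mirrored pairs striped together mean half the raw capacity is usable, reads can be served by every spindle, and each logical write costs two physical writes. The per-disk figures below are generic assumptions for illustration, not the specs of the server described above:

```python
# Rough sizing math for a 16-disk RAID 10 array (assumed per-disk figures).
disks = 16
disk_iops = 180          # assumed IOPS for one 15k SAS spindle
disk_capacity_gb = 300   # assumed per-disk capacity

usable_gb  = disks * disk_capacity_gb // 2   # half lost to mirroring
read_iops  = disks * disk_iops               # all spindles serve reads
write_iops = disks * disk_iops // 2          # RAID 10 write penalty of 2

print(usable_gb, read_iops, write_iops)  # 2400 2880 1440
```

Even with the mirroring overhead, a dedicated array like this delivers an order of magnitude more sustained I/O than a shared NAS fighting other workloads.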
After moving the DB to it, performance is now finally good. What used to take a few hours back in 2009, now only takes a few minutes. Finally, our Teamcenter database is working as expected. All that hard work finally paid off. But…
Fall 2011, Winter, Spring 2012 – Here We Go Again
After putting the DB bottleneck problem behind us, a new bottleneck arose: the WAN. Following Siemens PLM Teamcenter Development Consultant David Howe’s White Paper on Teamcenter Network Performance Tuning, I was able to make a number of server and client tweaks to improve network performance.
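One reason WAN tuning matters so much for Teamcenter is that a chatty client/server protocol pays the round-trip time once per request, so latency, not bandwidth, dominates. The round-trip count and RTT values below are illustrative assumptions, not measurements from our WAN:

```python
# Latency math for a chatty operation: total wait = round trips * RTT.
def chatty_op_seconds(round_trips, rtt_ms):
    """Seconds spent purely waiting on the network."""
    return round_trips * rtt_ms / 1000.0

round_trips = 500  # assumed requests for one check-in style operation

print(chatty_op_seconds(round_trips, 1))   # LAN, 1 ms RTT  -> 0.5 s
print(chatty_op_seconds(round_trips, 80))  # WAN, 80 ms RTT -> 40.0 s
```

The same operation that feels instant on the LAN can take most of a minute over the WAN, which is why reducing round trips (and tuning the accelerator) pays off more than adding bandwidth.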
However, I am now in a tennis match with IT regarding our network connection to our other locations. My SQL performance story repeats itself with a new main character. I ask for help with the WAN. They say that there’s nothing wrong with it. I ask for specific details of our network configuration including questions about our WAN accelerator. No one has the answers. Here we go again. Since this new topic is still ongoing, I’ll have to wait until it’s resolved. Then I’ll write an article about it later.
Hard Earned Lesson Summary
- Don’t take anyone’s word for granted. Inquire, Debate, Test, Verify, Document. When they say, “There is nothing wrong with the <fill in the blank>,” ask to see the monitoring they did to verify it. Or, provide your own results that show otherwise.
- Have someone take ownership of getting the problem resolved. The problem I had was that no one in IT would take ownership of the issue. There were many buckets of knowledge, but no one wanted to venture past their own realm to figure out what was going on.
- Follow Siemens PLM (and Microsoft’s/Oracle’s) recommendations for DB server hardware. Use a 64-bit OS. Use a 64-bit DB. Get a lot of RAM. Get a lot of high speed disks. Make sure the disks are dedicated to the DB and not shared with other applications/users/departments.
I hope this article gives hope to others like me that are currently experiencing similar issues.