New Server, Optimizations
So we got a new server at work to run my ETL program on. It's an HP blade with two dual-core AMD Opterons and 10 GB of RAM. We installed Windows Server 2008 on it and a version of Visual Studio Team Suite, so now I have some useful profiling tools. I found out my program spends 10% of its time in the ToLower() string method. Our CRM system is weird with strings: it *sometimes* returns attribute types with capitals and *sometimes* gives them all in lowercase, depending on where you get them from. The problem ended up being in this function, which was a pretty big bottleneck:
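public static string getAttributeType(string entity, string attribute)
{
    // Linear scan over every known attribute on each call.
    foreach (CRMAttribute attrib in attributes)
    {
        // Four ToLower() calls (and four new strings) per iteration.
        if (attrib.entityName.ToLower() == entity.ToLower()
            && attrib.attribute.ToLower() == attribute.ToLower())
        {
            return attrib.type;
        }
    }
    // No match found. Returning null here is an assumption; throwing would also work.
    return null;
}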
I hadn't realized it would be called so often, so I replaced it with a hash lookup and now it runs much faster. I've always liked how profiling tools turn up such unexpected results once you get into a project of this magnitude.
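For anyone curious what a hash lookup like that could look like, here is a minimal sketch. It is not the actual code from the ETL program. The CRMAttribute class is a hypothetical stand-in with just the three fields the function above uses, and AttributeTypeCache, Load, and MakeKey are names made up for illustration. It also assumes that '|' never appears in an entity or attribute name, and that an ordinal case-insensitive comparison is an acceptable replacement for the ToLower() comparison.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the CRMAttribute type used above.
public class CRMAttribute
{
    public string entityName;
    public string attribute;
    public string type;
}

// Illustrative name; builds the index once and answers lookups from it.
public static class AttributeTypeCache
{
    // A case-insensitive comparer means no ToLower() calls at lookup time.
    private static readonly Dictionary<string, string> typesByKey =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Assumption: '|' never appears in an entity or attribute name.
    private static string MakeKey(string entity, string attribute)
    {
        return entity + "|" + attribute;
    }

    // Build the index once, after the attribute list has been loaded.
    public static void Load(IEnumerable<CRMAttribute> attributes)
    {
        typesByKey.Clear();
        foreach (CRMAttribute attrib in attributes)
        {
            string key = MakeKey(attrib.entityName, attrib.attribute);
            // Keep the first entry for a key, like the original loop's first match.
            if (!typesByKey.ContainsKey(key))
            {
                typesByKey.Add(key, attrib.type);
            }
        }
    }

    // Returns null when the entity/attribute pair is unknown.
    public static string getAttributeType(string entity, string attribute)
    {
        string type;
        return typesByKey.TryGetValue(MakeKey(entity, attribute), out type) ? type : null;
    }
}
```

Each call becomes one string concatenation and one hash lookup instead of a scan over the whole list. Since the program is multithreaded, Load would need to finish before the worker threads start reading: a Dictionary is only safe for concurrent readers while nothing is modifying it.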
The other bottleneck is threads waiting on other threads. I'm not sure what I can do about this, but I'm going to investigate. It isn't worth spending too much time on optimizations, though, because the program spends the majority of its time publishing to the CRM web service, which we can't speed up.
