Hi guys,
I got a general architecture question, hope guys can help me. So some basic background first. I have a midsize customer base using my flagship product, which is mainly a crm accounting software. Its a windows based product and they host the data on their servers (usually at their offices).
I have a clickonce application that uploads a subset of their database to our servers. We store all our customer's data in one database. We give each customer and their subsequent records a datasetid.
The clickonce works using a manifest, which essentially figures out which records to add, delete or update on our servers. The transfer of data between the customer's server and our server is through a web service. After the inital upload (which is data intensive), the subsequent uploads only push the difference up (which is much smaller). A majority of our clients tend to upload once day.
We also have a web application that accesses this subset of data to display information to a variety of users, who are our customer's clients. We have about 3000 active users for this web application, which hit the website about once a week.
The issue lie in this, the queries to our database from the website, as well as from the upload service are timing out. Everytime we do a new release, we expand the subset of data (ie add a new table or 2). So immediately after the release we get bombarded with uploads. Hence our site is down alot recently. I think too much traffic is happening to the one database, in terms of data being uploaded and data retrieval for the website.
We would like to keep the subset of data as synchronised with flagship products data as much as possible.
Things we have aleady got include, Scheduler for the clickonce to upload data early in the morning. We are currently implementing a queueing system so only x amount of people can upload at once.
What are some avenues I can explore for the long term and the short term?
Cheer.
-
One thing I would make sure you look at before you start digging too deep into the optimization process is identifying your bottlenecks. Is it client-side processing, client bandwidth, server bandwidth, or server-side processing. Assuming you're not shipping massive amounts of data around, it's likely server-side processing, but making assumptions like that can get you into trouble on these kinds of things.
-
If the uploads don't have to be instantly available (can wait a little bit before you select the data) I'd strongly consider putting a queue in front of that using MSMQ / WCF. That can help you throttle your intake of that data.
Next it's important to analyze your database structure. There isn't too much information about your schema here, or number of rows, etc... but there are several things I can recommend if you have a lot of incoming data.
- Make sure that you aren't using a technology that isn't up to the task (like Access or Sqlite).
- Consider optimizing a database for Inserts, then replicate that database to other databases you might use for selects.
- If you have a tremendous amount of data, consider setting up a cluster
- Consider switching to a cloud based data service like Amazon SimpleDB or Azure Data Services
-
Where's your bottlneck? Is it a bandwidth issue with your data connection, an issue on your database server, or on your web server, etc?
What database software are you running? If it's access or something like that it's got to go, in favor or something slightly more enterprise like sql server or similar.
Are all the clients trying to upload at the same time? Can you stager them somehow, maybe by having them upload at a random time throughout the day, or a randome time during a specific window.
Would clustering your web server or db server help?
Can you implement some sort of queue through msmq as already mentioned, or use biztalk or a similar product. That way the client can submit their results, and forget about it, then the queue software handles the actual delivery.
0 comments:
Post a Comment