Topic: Parallelized Log Processing and the Effect of Data Center Packet Loss on Storage Performance
Speaker: Jon Meek
Date: Thursday, March 6, 2014
Time: 7:00pm (social), 7:30pm (discussion)
Pizza and soda are being brought to you by: INetU
If you are planning on coming, please RSVP (http://www.lopsanj.org/rsvp) so we have a good count for the pizza and drinks.
Lawrence Headquarters Branch of the Mercer County Library
2751 Brunswick Pike
See map: Google Maps (New link with better coordinates)
I process log files for about 300 million proxy events per day. The processing involves three production passes, plus ad hoc data extracts for performance measurements, application debugging, security, etc. Between two and 60 cores are used to process the data in a reasonable amount of time. Each file being processed gets a pair of cores: one for decompression and the other for the processing.
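As a rough illustration of the pair-of-cores-per-file arrangement (not the actual program, which will be described at the meeting), here is a minimal Python sketch. It assumes gzip-compressed logs, a Unix-like system where the "fork" start method is available, and a hypothetical log layout in which the first whitespace-separated field is the key to count. The names `decompress`, `process`, `extract`, and `BATCH` are invented for this example.

```python
import gzip
import multiprocessing as mp
from collections import Counter

# Assumption: a Unix-like host where the "fork" start method exists.
ctx = mp.get_context("fork")

BATCH = 1000  # hypothetical tuning knob: lines per queue message


def decompress(path, lines_q):
    """First process of the pair: decompress one file, stream batches of lines."""
    batch = []
    with gzip.open(path, "rt") as fh:
        for line in fh:
            batch.append(line)
            if len(batch) >= BATCH:
                lines_q.put(batch)
                batch = []
    if batch:
        lines_q.put(batch)
    lines_q.put(None)  # sentinel: no more data for this file


def process(lines_q, results_q):
    """Second process of the pair: count events by first field (assumed layout)."""
    counts = Counter()
    while True:
        batch = lines_q.get()
        if batch is None:
            break
        for line in batch:
            fields = line.split()
            if fields:
                counts[fields[0]] += 1
    results_q.put(counts)


def extract(paths):
    """Start one decompress/process pair per file, then merge the per-file counts."""
    results_q = ctx.Queue()
    procs = []
    for path in paths:
        lines_q = ctx.Queue(maxsize=8)  # bounded, so the decompressor cannot run far ahead
        procs.append(ctx.Process(target=decompress, args=(path, lines_q)))
        procs.append(ctx.Process(target=process, args=(lines_q, results_q)))
    for p in procs:
        p.start()
    total = Counter()
    for _ in paths:
        total.update(results_q.get())  # collect results before joining
    for p in procs:
        p.join()
    return total
```

This sketch starts every pair at once; with many files, a real run would presumably cap the number of concurrent pairs at about half the available cores.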
My new “poor man’s map reduce” program for log data extraction will be described, along with the story of my disappointment with the performance improvement until I discovered that performance was much better when the data were supplied by another NAS system. The value of collecting long-term performance data will be demonstrated.
Since William is looking for a new computer bag, I’ll bring a few of my favorites. Bring your favorite bags and we’ll have a show and tell!
“LOPSA-NJ is an organization for system administrators in New Jersey formed to facilitate information exchange pertaining to the field of system administration. LOPSA-NJ is not affiliated with a particular hardware or software vendor or company. Everyone is invited!”