Over the past few weeks I have been working on a mid-sized Flash project. Since it is all about multi-user communication and low latency was crucial, our first choice was of course Flash Media Server. I spent a few days programming the server side (FMS) and a few more days optimizing its performance and debugging, and finally it was fast enough to go live. The server side was just about calling client methods (client -> fms -> client) and some SharedObjects; no video or audio streaming was used. My company dedicated a super fast server (4 x quad-core Intel(R) Xeon(R) CPU E7440 @ 2.40GHz, 32 GB RAM) just to Flash Media Server, so we were able to run it on 16 cores. We installed FMS v.3.0.1 r123 with a valid unlimited licence on it. And we went live.
First we tried the default FMS configuration, and soon after the start master.log (and all the other logs) began to grow. Most of the log consisted of entries like these:
2009-10-07 00:03:31 24962 (i)2581223 Core (8684) is no longer active.
2009-10-07 00:03:31 24962 (w)2581256 Core (8684) _defaultRoot_:_defaultVHost_:::_1 experienced 1 failure[s]!
2009-10-07 00:03:31 24962 (i)2581221 Core (10772) started, arguments : -adaptor "_defaultRoot_" -vhost "_defaultVHost_" -app -inst -tag "_1" -conf "./conf/Server.xml" -name "_defaultRoot_:_defaultVHost_:::_1".
This caused our client app to be disconnected for a few seconds, but it reconnected automatically. So no big deal, right? … Well, after a few days the server started to refuse any connection to running instances (even ones with 1-2 clients in them). There was no option other than to restart the whole FMS. We started to think this was not good. So we took action…
Well, what does uncle Google know about this? 3 000 000 results? The results took me to the Adobe forums, where I found dozens of topics describing the same problem. But no working solution for anyone.
First we tried to configure and reconfigure everything here and there, from the default 3 cores to 1 and then to 16 cores, disconnecting idle clients, etc. … but still the same. Master.log kept growing linearly as more and more clients connected at the same time (and we are not talking about huge loads, just 10-20 simultaneous connections). Core crashes were logged every 5 minutes, and after a few days the server was refusing all client connections.
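To put a number on how often the cores fail, something like the following could be run over master.log. It is only a minimal sketch: the class and method names are made up for illustration, and the pattern assumes every failure entry looks like the "(w)… Core (…) … experienced 1 failure[s]!" line quoted above, which is the only sample I have.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of FMS: counts core failure entries per hour.
public class CoreFailureCounter {

    // Assumed entry shape, taken from the log sample above:
    // 2009-10-07 00:03:31 24962 (w)2581256 Core (8684) <instance> experienced 1 failure[s]!
    private static final Pattern FAILURE = Pattern.compile(
        "(\\d{4}-\\d{2}-\\d{2} \\d{2}):\\d{2}:\\d{2} \\d+ \\(w\\)\\d+ Core \\(\\d+\\) \\S+ experienced \\d+ failure");

    // Returns a sorted map from "yyyy-MM-dd HH:00" to the number of failure entries in that hour.
    public static Map<String, Integer> failuresPerHour(Iterable<String> lines) {
        Map<String, Integer> perHour = new TreeMap<>();
        for (String line : lines) {
            Matcher m = FAILURE.matcher(line);
            // find() in a loop, so several entries on one physical line are all counted
            while (m.find()) {
                perHour.merge(m.group(1) + ":00", 1, Integer::sum);
            }
        }
        return perHour;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to master.log; no default, since the location depends on the install
        Map<String, Integer> perHour = failuresPerHour(Files.readAllLines(Paths.get(args[0])));
        perHour.forEach((hour, count) -> System.out.println(hour + "  " + count));
    }
}
```

With a crash roughly every 5 minutes, each hour bucket would show about a dozen entries, which makes it easy to see whether a configuration change made any difference.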
Okay, let's handle this with a watchdog. We created a Java watchdog that tests the connection to an instance, and an fmscheck watchdog that checks all running instances every 10 minutes, just to know when the server is down (refusing connections). We soon discovered that the Java watchdog reported correctly, and based on its output we were able to restart FMS in case of a crash. But this was not a solution.
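The idea behind such a watchdog can be sketched in a few lines of Java. This is not our actual watchdog, just a simplified illustration: it only checks whether a plain TCP connection to the RTMP port (1935 by default) is accepted, whereas a real check would have to connect to the application instance itself. The class name, the timeout and the restart hook are all assumptions.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical sketch of a connection-based watchdog.
public class FmsWatchdog {

    // Returns true if a TCP connection to host:port is accepted within timeoutMs.
    // Note: this proves only that the port accepts connections, not that an
    // application instance inside the server accepts clients.
    public static boolean isAccepting(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // refused, unreachable or timed out
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 1935; // default RTMP port
        while (true) {
            if (!isAccepting(host, port, 5000)) {
                // Assumption: whatever reads this output decides how to restart the server.
                System.err.println("DOWN " + host + ":" + port);
            }
            Thread.sleep(10 * 60 * 1000L); // check every 10 minutes
        }
    }
}
```

Printing a line and leaving the restart to an external script keeps the check itself trivial, which matters when the thing being watched is already unreliable.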
The next step was to contact support. I wrote and posted an article on the Adobe forum, but no one has replied yet. As an Adobe customer we sent an e-mail to Adobe support, but got no reply. Finally our local solution manager for Adobe replied that we should try upgrading to v.3.0.4. We did, but nothing changed. We even tried running v.3.5.x, but with the same results.
Our administrators even tried to trace the core crash at the system level with “strace -p PID”, but while strace was attached to the process, the core never crashed. A solution? Not at all, since, as I was told, tracing slows the system down. There is one more way to trace the crash (Core crash on linux), but after all the previous failures we never gave it a try…
A few weeks after we launched the project we were still not able to make it run smoothly; cores crashed here and there, and our client was of course not happy…
We were forced to change the socket server, and as an alternative we chose Red5. Since it is written in Java, we were able to rebuild and debug the server side in a few days. And now our project runs great. Our solution is the open source Flash server Red5.
I am really disappointed with FMS because:
- FMS is crashy and broken, and the log files do not contain the important information
- even though we paid big money for the licence, we did not get any support that would have helped us
- not to mention the browser crashes caused by corrupted SharedObjects
It is great that we get all these new features with every new version of Flash Media Server, but as long as it is this unstable, I do not know whether I will ever count on Flash Media Server again.
Update, Mar 10, 2010:
We have seen similar issues and the best work around is to have your clients connect on alternate cores/servers. The way to accomplish this really depends on your type of application. (Adobe Forums)