klytus

  1. Roman, thank you! I was able to get everything working thanks to the information you provided. Amazon do indeed block all multicast and broadcast packets on their network. The solution was to bind a public IP to every EC2 instance on boot (a checkbox in the launch screen); they all then connect to the tracker server, find each other that way, and transmit all data over the local subnet. The more nodes I spin up, the more efficient the system seems to be. It's perfect so far.

     The solution to starting a unique BTSync device on each cloned machine is to start them with an empty config directory on boot (as you advised). I have a small script that launches BTSync with /CONFIG pointed to a config file that sets up all the folders to sync, with the devicename field commented out, so the sync device name always equals the unique Amazon machine name for each instance. For some reason, if I reboot an instance the only way BTSync comes back up properly is with the storage directory wiped clean again, so my startup script takes care of that too.

     I've been testing the system on a new job and it's performing well. The cloud farm nodes and my local machine's filesystem are seamlessly blended, and since our render nodes are now working off their own locally attached drives, the whole setup works wonderfully efficiently.
  2. We run a suite of grid computing applications (sadly, all Windows-only) in the EC2 cloud to produce computer-graphics rendered content for film and television. Most recently we used our EC2 system to produce several shots for an upcoming episode of the new Cosmos series. In doing so we ran up against severe IO problems as 80 nodes attempted to access a 1.3 GB particle cache file from the EC2 instance acting as a server; since the job was a several-thousand-frame sequence, every machine needed to access this file dozens of times. We solved the issue by burning the cache into the C: drive of a one-off Amazon Machine Image that was then spun up into 80 nodes to complete the work. This work-around is fine, but time-consuming to set up, so we decided to implement a distributed file system between all the nodes using BTSync. BTSync works wonderfully between our studio and the cloud; we are in love with the software already. However, once in the cloud we run into the following problems:

     1. We have found in EC2 that any application that searches the cloud subnet for other running instances of itself will not find any other nodes. Amazon don't allow certain kinds of packets across their network. BTSync works fine if each node is explicitly set up with the IPs and ports of the other nodes, but no matter how we configure the firewall rules on the EC2 side, searches of the local subnet don't work, the nodes can't find each other, and nothing syncs.

     2. Because every render node is spun up from the same master image, they all come up with the exact same BTSync device name, which seems to cause problems. Because there is no command-line interface in the Windows version of BTSync, there is no automated way to set this. Manually editing the config files (which don't appear to be ASCII) always fails because it disrupts a checksum and the config file is renamed by the app to .bad.

     While there is probably no way around problem 1, for problem 2 the addition of a command-line interface would be immensely useful to us: we could run a startup script to set the BTSync device name for each node to its unique EC2 machine name and, as part of the same script, create sync shares with the appropriate secret and the IPs and ports of the other BTSync nodes on the local subnet. As an alternative to a command-line interface, a build of Windows BTSync that uses plain-text config files would be hugely welcome, as we could set those up with scripts. As I said, we love the software, and these problems notwithstanding, we see great potential for it in large-scale cloud applications.
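Taken together, the two posts describe a workable boot-time recipe: wipe BTSync's storage directory, write a per-node config that omits the device name (so it falls back to the unique EC2 machine name) and enables the tracker, then launch BTSync with /CONFIG. A minimal sketch of such a script follows; the install paths, peer addresses, port, and folder secret are placeholders, and the JSON field names are taken from the sample sync.conf shipped with the headless (Linux) build, so whether the Windows build accepts the same schema is an assumption.

```python
# Hypothetical boot script for a cloned BTSync render node on EC2 (Windows).
# Paths, the port, peer addresses, the secret, and the JSON field names
# (borrowed from the headless build's sample sync.conf) are all assumptions.
import json
import os
import shutil
import subprocess

BTSYNC_EXE = r"C:\btsync\BTSync.exe"   # assumed install location
CONFIG_FILE = r"C:\btsync\sync.conf"   # plain-text config written below
STORAGE_DIR = r"C:\btsync\storage"     # wiped each boot, per the reboot fix

def make_node_config(secret, sync_dir, peers, storage_path):
    """Build a config dict with no device_name, so the device name
    falls back to the instance's unique machine name."""
    return {
        "storage_path": storage_path,
        "shared_folders": [{
            "secret": secret,
            "dir": sync_dir,
            "use_tracker": True,   # nodes find each other via the tracker
            "known_hosts": peers,  # explicit IP:port fallback (assumption)
        }],
    }

def write_config(path, config):
    """Write the config as plain-text JSON."""
    with open(path, "w") as f:
        json.dump(config, f, indent=2)

def reset_storage(storage_dir):
    """Wipe and recreate the storage dir so BTSync starts fresh."""
    shutil.rmtree(storage_dir, ignore_errors=True)
    os.makedirs(storage_dir, exist_ok=True)

def build_command(exe, config_file):
    """The Windows build is launched with the /CONFIG switch."""
    return [exe, "/config", config_file]

if __name__ == "__main__" and os.path.exists(BTSYNC_EXE):
    reset_storage(STORAGE_DIR)
    write_config(CONFIG_FILE, make_node_config(
        secret="MY_FOLDER_SECRET",                 # placeholder secret
        sync_dir=r"D:\render_cache",               # placeholder sync folder
        peers=["10.0.1.5:3839", "10.0.1.6:3839"],  # placeholder peers
        storage_path=STORAGE_DIR,
    ))
    subprocess.Popen(build_command(BTSYNC_EXE, CONFIG_FILE))
```

Because the script regenerates the config and storage directory on every boot, a rebooted or freshly cloned instance always comes up as a distinct sync device, which is the behavior the first post reports needing.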