LGC
= Large Group Collaboration
Problem
When 6+ people are actively collaborating on a space (LGC), the number of broadcast events being received and processed per frame (especially painting, cursor position) is too much work for client computers, and stresses the kinopio-server which acts as a broadcast relay.
Goals
- LGC shouldn’t sink client performance
- LGC shouldn’t cause server outages
Context: How Collaboration Works Today in Kinopio
Each client (eg a kinopio web-browser window/session) opens a websocket connection to the server when it loads a space. When you perform an action on a client, that event is sent to the server. The server has a simple broadcast relay system that then sends that event to all other clients connected to the space.
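To make the relay model concrete, here is a minimal in-memory sketch of it. This is not the actual kinopio-server code: `SpaceRelay`, `Client`, and `BroadcastEvent` are hypothetical names, and `send()` just stands in for whatever the real websocket connection exposes.

```typescript
// Hypothetical shapes: none of these names come from the Kinopio codebase.
type BroadcastEvent = { type: string; spaceId: string; payload: unknown }

interface Client {
  id: string
  send(message: string): void // stand-in for a websocket connection's send
}

class SpaceRelay {
  // spaceId -> (clientId -> client) for everyone currently in that space
  private spaces = new Map<string, Map<string, Client>>()

  join(spaceId: string, client: Client): void {
    let clients = this.spaces.get(spaceId)
    if (!clients) {
      clients = new Map<string, Client>()
      this.spaces.set(spaceId, clients)
    }
    clients.set(client.id, client)
  }

  leave(spaceId: string, clientId: string): void {
    const clients = this.spaces.get(spaceId)
    if (!clients) return
    clients.delete(clientId)
    // Drop empty spaces so the map does not grow forever
    if (clients.size === 0) this.spaces.delete(spaceId)
  }

  // Forward an event to every client in the space except the sender.
  // Returns the number of messages sent.
  relay(senderId: string, event: BroadcastEvent): number {
    const clients = this.spaces.get(event.spaceId)
    if (!clients) return 0
    const message = JSON.stringify(event)
    let sent = 0
    clients.forEach((client, id) => {
      if (id === senderId) return
      client.send(message)
      sent += 1
    })
    return sent
  }
}
```

The thing to notice is that one incoming event turns into one outgoing message per other client in the space, which is why group size matters so much.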
The list of broadcast events includes:
- painting (sent per frame during painting)
- cursor position (sent per frame while cursor is in viewport)
- card moving (sent per frame while card is being moved)
- card is being edited
- card is edited (eg name changed)
- new connection is being created
- connection is being edited (eg change color/show label)
In LGC, sending per-frame broadcasts means a lot is being relayed and processed, up to 60 broadcasts per second per client. In a scenario with 6 clients where everyone is moving or painting, that’s potentially 6*60 = 360 broadcasts being received and processed every second. With 30 clients, that jumps to 1800 broadcasts per second. Yikes.
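A quick worst-case calculation for the numbers above, assuming every client sends one broadcast per frame at 60fps and the server relays each one to all other clients in the space. `relayLoad` and its field names are made up for illustration. The 360 and 1800 figures count what clients send to the server. Under the relay model, the server's outgoing traffic grows faster, since each incoming broadcast is forwarded to the other n - 1 clients.

```typescript
const FRAMES_PER_SECOND = 60

interface RelayLoad {
  serverInbound: number   // broadcasts arriving at the server per second
  perClientInbound: number // broadcasts each client receives per second
  serverOutbound: number  // messages the server sends out per second
}

// Worst case: all clients are moving or painting on every frame.
function relayLoad(clients: number, eventsPerSecond: number = FRAMES_PER_SECOND): RelayLoad {
  const others = Math.max(clients - 1, 0)
  return {
    serverInbound: clients * eventsPerSecond,
    perClientInbound: others * eventsPerSecond,
    serverOutbound: clients * others * eventsPerSecond,
  }
}

console.log(relayLoad(6))  // serverInbound: 360, perClientInbound: 300, serverOutbound: 1800
console.log(relayLoad(30)) // serverInbound: 1800, perClientInbound: 1740, serverOutbound: 52200
```

So per-client work grows roughly linearly with group size, while server relay work grows roughly with its square.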
Other Approaches Elsewhere
Trello/Glitch
Trello’s client works like Kinopio’s, but Trello clients only broadcast low resolution events (e.g. instead of showing a card being dragged in real time, Trello only shows/broadcasts the new card position after the drag is completed). This is fine for an enterprise kanban situation, but (default) low resolution collaboration is less fun and doesn’t fit Kinopio’s interaction model very well.
Figma/Mural
Probably the closest parallel. I’m not really a Figma user so I have no insight into how it holds up to LGC IRL. They don’t have the technical burden of painting though, which probably helps a lot (maybe that’s enough?).
I found https://www.figma.com/blog/how-figmas-multiplayer-technology-works , but it doesn’t explicitly mention how LGC/high numbers of simultaneous updates are handled. Looks like Figma has a limit of 50 collaborators (https://forum.figma.com/t/are-there-any-collaborator-limits-per-project-file/1150/4)
Mural has a help doc that mentions that perf is expected to slow down during LGC: https://support.mural.co/en/articles/5444182-best-practices-for-better-performance-in-mural
Based on this, I don’t think these work any differently from Kinopio. The main difference is that they don’t have painting, which is a relatively intense CPU/GPU operation.
Nodenogg.in
Gives the user configurable modes that let you see only your own cards, which allows the user to opt out of handling broadcast events. The technically easiest solution, but actively having to choose and switch modes to handle technical issues is not in the spirit of a toolbar/mode-less tool like Kinopio.
Fighting games/GGPO/rollback-networking
Largely irrelevant to Kinopio, but kinda neat to learn about. GGPO is mainly for situations that require realtime historical event synchronization and correction, and for situations where possible inputs are limited. Really smart idea that basically works by assuming that the input that will happen next is similar to the input that happened before. If I walked forward in the last frame I’ll probably walk forward in the next. If the prediction is wrong, the game ‘rolls back’ to the correct sequence of events.
Because of its cleverness, the rollback system falls apart really fast with anything other than two clients (eg P1 vs P2).
Proposed Solution(s)
- Possibly dynamically adjusting the resolution (timing) of when broadcasts are sent, based on number of clients.
- Or, not broadcasting painting events during LGC, and showing a notification to inform the user that painting isn’t shown in LGC (easier to do, but less elegant). I’m not sure how effective just targeting painting alone will be irl yet, but it might be worth doing as an easier to implement experiment.
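A rough client-side sketch of both options, to make them easier to compare. Everything here is an assumption for illustration: the names, the budget of 300 incoming broadcasts per second per client (picked so that groups of up to 6 stay at full 60fps), and the idea that the client knows the current client count. Time is passed in rather than read from a clock so the logic is easy to test.

```typescript
const FRAME_MS = 1000 / 60
const LGC_MIN_CLIENTS = 6 // "6+ people" from the problem statement
const INBOUND_BUDGET_PER_SECOND = 300 // hypothetical per-client budget

// Option 1: stretch the send interval as the group grows, so that the
// other clients together never exceed the receiving budget.
function broadcastIntervalMs(clientCount: number): number {
  const senders = Math.max(clientCount - 1, 1)
  const interval = (1000 * senders) / INBOUND_BUDGET_PER_SECOND
  return Math.max(FRAME_MS, interval) // never faster than one per frame
}

// Option 2: simply stop broadcasting painting once the group is large.
function shouldBroadcastPainting(clientCount: number): boolean {
  return clientCount < LGC_MIN_CLIENTS
}

class BroadcastThrottle {
  private lastSentAt = new Map<string, number>() // event type -> ms timestamp

  // Returns true if an event of this type may be sent at time `now` (ms).
  shouldSend(eventType: string, now: number, clientCount: number): boolean {
    const last = this.lastSentAt.get(eventType)
    if (last !== undefined && now - last < broadcastIntervalMs(clientCount)) {
      return false // too soon, drop this intermediate event
    }
    this.lastSentAt.set(eventType, now)
    return true
  }
}
```

With these made-up numbers, a 30-client space would send cursor/move/paint updates roughly every 97ms (about 10 per second) instead of every frame. Dropped events are only safe for intermediate states, so the final event of an interaction (eg where a card ends up after a drag) would still need to be sent unthrottled.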
Other ideas?
(also to do:)
- Investigate possible memory leak issues in the server broadcast relay where old clients may not be getting cleaned up