When we talk about scalability, we expect or want to hear the words "linear performance gains". To be honest, though, that is not the case in most setups: their reliance on another server or service is too great, so bottlenecks form within the network you're trying to host for users.
As we explore options we hear about things like databases, message queues, and brokers. These are fine to use, but as mentioned above, if you rely on any of them too heavily you will wreck your setup in short order.
Design the WSS server to act solo (unless requirements are exceeded). You determine and set the limits, and let the API server know them. Say I have 10 chat rooms that hold a maximum of 100 users each, and benchmarking my WSS server showed it could hold 400-500 users. With that information I'd set 4-5 rooms per server. At 5 rooms per server, if two people enter room #1 they are both on WSS server #1; once all 10 chat rooms are full, WSS server #2 is full as well, and the 11th room will need WSS server #3, which covers rooms up to the 15th.
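The arithmetic above can be sketched in a few lines. This is only an illustration of the idea, not a definitive implementation: the names `server_for_room` and `servers_needed` are made up for this example, and it assumes the upper end of the benchmark (500 users), which gives 5 rooms per server.

```python
import math

MAX_USERS_PER_ROOM = 100         # room cap from the example
BENCHED_USERS_PER_SERVER = 500   # assumption: upper end of the 400-500 benchmark

# How many full rooms one WSS server can safely hold (5 with these numbers).
ROOMS_PER_SERVER = BENCHED_USERS_PER_SERVER // MAX_USERS_PER_ROOM


def server_for_room(room_number: int, rooms_per_server: int = ROOMS_PER_SERVER) -> int:
    """Return the 1-based WSS server number that hosts a 1-based room number."""
    if room_number < 1:
        raise ValueError("room numbers start at 1")
    return (room_number - 1) // rooms_per_server + 1


def servers_needed(room_count: int, rooms_per_server: int = ROOMS_PER_SERVER) -> int:
    """Return how many WSS servers are needed to host room_count rooms."""
    return math.ceil(room_count / rooms_per_server)


if __name__ == "__main__":
    for room in (1, 5, 6, 10, 11, 15):
        print(f"room #{room} -> WSS server #{server_for_room(room)}")
    print("servers needed for 11 rooms:", servers_needed(11))
```

The API server would be the one doing this lookup, so a client asking to join room #11 gets pointed at WSS server #3 before it ever opens a socket.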
The slowest part of the network would now just be your API server handling requests, though this may include the database as well.
If your requirements call for more users than in the example, you can increase the server's core power first, or add a second server with the help of an MQ or a Redis Pub/Sub type setup.
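To show what the pub/sub part buys you, here is a rough sketch of two WSS servers sharing one room through a broker. It assumes an in-memory stand-in (`InMemoryBroker`) instead of a real MQ or Redis, and `WssServer`, `host_room`, and `send` are hypothetical names for this example only; in a real setup the broker calls would go over the network.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class InMemoryBroker:
    """Stand-in for an MQ or Redis Pub/Sub: channels map to subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: str) -> int:
        """Deliver message to every subscriber; return how many received it."""
        handlers = list(self._subscribers.get(channel, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


class WssServer:
    """Hypothetical WSS server; `delivered` stands in for writes to client sockets."""

    def __init__(self, name: str, broker: InMemoryBroker) -> None:
        self.name = name
        self.broker = broker
        self.delivered: List[str] = []

    def host_room(self, room: str) -> None:
        # Subscribe so messages published by other servers reach our local users.
        self.broker.subscribe(room, self.delivered.append)

    def send(self, room: str, message: str) -> int:
        # Publish instead of writing locally, so every server hosting the room sees it.
        return self.broker.publish(room, message)


if __name__ == "__main__":
    broker = InMemoryBroker()
    first, second = WssServer("wss-1", broker), WssServer("wss-2", broker)
    first.host_room("room#1")
    second.host_room("room#1")
    first.send("room#1", "hello")
    print(first.delivered, second.delivered)
```

Note that every message for a shared room now passes through the broker, which is exactly the kind of reliance warned about earlier, so it is worth keeping this to rooms that genuinely outgrow one server.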
Unfortunately there's no way to properly sort users, so if 3 rooms had 20 users each and all were sitting on WSS server #1, that would still leave one room slot open along with hundreds of unused user slots. But is this really a problem?
It's possible that room could fill right up, so leave the spot open for it. Still, it could be days until the server maxes out, so programming something tailored to your needs will improve how cost-effective the setup is.
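As a starting point for that kind of custom logic, the sketch below measures how much of a server is sitting idle. It assumes 4 rooms per server and 100 users per room (the lower end of the earlier example), and `spare_capacity` is a hypothetical helper name, not part of any library.

```python
from typing import List, Tuple


def spare_capacity(
    room_user_counts: List[int],
    rooms_per_server: int = 4,      # assumption: lower end of the 4-5 rooms example
    max_users_per_room: int = 100,  # room cap from the example
) -> Tuple[int, int]:
    """Return (open room slots, open user slots) for one WSS server."""
    if len(room_user_counts) > rooms_per_server:
        raise ValueError("server hosts more rooms than its limit")
    if any(count < 0 or count > max_users_per_room for count in room_user_counts):
        raise ValueError("room user count outside 0..max_users_per_room")

    open_room_slots = rooms_per_server - len(room_user_counts)
    open_user_slots = rooms_per_server * max_users_per_room - sum(room_user_counts)
    return open_room_slots, open_user_slots


if __name__ == "__main__":
    # Three rooms with 20 users each on one server.
    rooms_left, users_left = spare_capacity([20, 20, 20])
    print(f"{rooms_left} room slot(s) and {users_left} user slot(s) still open")
```

What you do with those numbers (leave the server alone, or hold off on starting the next one) is the part that depends on your own traffic patterns.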