Text chat is relatively simple. It involves three tier architecture. 1) Javascript timer. 2) WCF Ajax Enabled web service or Generic Http Handler, 3) Data Storage (Preferably SQL).
1) On the page - sending: input text box + button (used to send). The button click event handler or the text box's key down (for enter key) and blur events would invoke a post (via JQuery, plain ol' JavaScript or whatever Javascript library you use) to the WCF service/Generic handler, sending the contents of the text box, along with the chartroom name, the addressee, and the recipient.
2) On the server: WCF Service/Generic Http handler receives the post and stores it in DB.
3) On the page - receiving: using JQuery for example, you would create a javascript timer on document ready (when the page loaded). On every timer's tick event you want to create a GET (or post) via your handy JavaScript framework (or Plain Javascript) to your WCF service/ Generic Handler requesting the latest records stored in the DB for that chatroom. Append the result received (assuming xml/html/json) to the Div or whatever element is used to display your "chats".
This is a very simplified chat in jquery/asp.net.
As far as audio-video is concerned, you have a few problems. 1) The browser itself has no means of interacting with the mike, speakers, and video camera, unless it uses a plugin. Moreover, browsers typically have no way of knowing how to decode a video stream (though some of the smarter ones have it built in... chrome, firefox). 3) Javascript has no way interacting with all the necessary hardware as it lives inside the browser.
All that said, you can use a plugin such as Flash or Silverlight, (that has built in access to the necessary hardware), or whatever. You will also have a conceptual dilemma with those as you have to simultaneously deal with 2 streams - one for coming in, another going out and displaying both. However it is possible.