OMwiki:Tech

(Difference between revisions)
(New Features: facial recognition: better to start with audio)
(New Features: add content delivery network)
Line 29: Line 29:
*[http://lists.wikimedia.org/pipermail/metavid-l/2009-August/000055.html Email thread on MetaVid-l]
-
*Implement a Vorbis-only option on video streams for low-bandwidth connections.
+
*Add a Vorbis-only option on video streams for low-bandwidth connections.
-
*Think about replacing animated GIFs with low-FPS, clickable, enlargeable Theora.  Alternatively, use [http://pad.ma/ Pad.ma]-style mouse-overs (see [https://wiki.pad.ma/browser/padma.dev/padma/static/javascript/info.js?rev=padma.dev%2C354&order=size&desc=1 <nowiki>$(imagePoster).load(function() {...})</nowiki>]).
+
*Replace animated GIFs with low-FPS, clickable, enlargeable Theora.  Alternatively, use [http://pad.ma/ Pad.ma]-style mouse-overs (see [https://wiki.pad.ma/browser/padma.dev/padma/static/javascript/info.js?rev=padma.dev%2C354&order=size&desc=1 <nowiki>$(imagePoster).load(function() {...})</nowiki>]).
-
*YUV4MPEG support in [http://www.blender.org/ Blender], to send to [http://v2v.cc/~j/ffmpeg2theora/ ffmpeg2theora] ([http://lists.mplayerhq.hu/pipermail/libav-user/2009-March/002639.html email thread])
+
*Add YUV4MPEG support in [http://www.blender.org/ Blender] to enable direct output to [http://v2v.cc/~j/ffmpeg2theora/ ffmpeg2theora] ([http://lists.mplayerhq.hu/pipermail/libav-user/2009-March/002639.html email thread])
*Squash bugs (esp., video non-playback) in mwEmbed's [http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/js2/mwEmbed/libSequencer/ libSequencer].  
Line 39: Line 39:
*Identify a way to output time ranges generated from multi-speaker audio files, with each time range corresponding to the duration of how long a person spoke for ([http://cmusphinx.sourceforge.net/ CMU Sphinx?]).  Hopefully add speaker identification based on the vocal profile, and maybe even facial recognition.
 +
*Figure out a content delivery network.
<br>
<br>

Revision as of 14:59, 20 September 2009

Known issues:

(Listed in order of severity.)

  • Videos don't always start playing after the 'play' icon is clicked. Nudging the seek marker slightly to the right works around the problem. This might be related to doActualPlay in nativeEmbed.js.
  • Find a way to add style="width:352px;" to the embed code for 352×240 streams.
  • Search works but has a few issues: punctuation is stripped, 'play inline' plays audio even though the video is paused, and search-by-date is not yet enabled for all videos.
  • Video playback might take ~6 seconds to start for timecodes towards the end of a meeting (pending seeking support in oggz-chop).




Scripts needed:

  • Retool download_from_archive_org.php to work with different source videos and meta tags from the Internet Archive. In the same pass, generate Media RSS <item> elements from the same metadata, and insert date_start_time into mv_streams.
  • Write a script to import clips into MetaVidWiki from apps that can produce .srt or .cmml captions (e.g., Gnome Subtitles).
  • Find a command-line script that produces .torrent files with multiple HTTP seeds, no tracker, and a comment, and that work with Mainline.
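
As a starting point for the .torrent item above, here is a minimal sketch in Python. It assumes that "HTTP seeds" can be expressed as BEP 19-style web seeds (the top-level url-list key) and that leaving out the announce key is enough for "no tracker"; whether Mainline accepts the result has not been checked. The names bencode and build_torrent and the example URLs are hypothetical, not part of any existing tool.

```python
import hashlib


def bencode(value):
    """Encode ints, str/bytes, lists, and dicts using BitTorrent bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys must be byte strings, emitted in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def build_torrent(name, data, web_seeds, comment, piece_length=262144):
    """Build single-file torrent metainfo with web seeds and no tracker.

    Assumption: web seeds go in the top-level 'url-list' key (BEP 19).
    """
    # Concatenate the SHA-1 digest of each fixed-size piece of the payload.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    meta = {
        "comment": comment,
        "info": {
            "length": len(data),
            "name": name,
            "piece length": piece_length,
            "pieces": pieces,
        },
        "url-list": list(web_seeds),  # no 'announce' key: trackerless
    }
    return bencode(meta)


if __name__ == "__main__":
    # Hypothetical payload and mirror URLs, for illustration only.
    torrent = build_torrent(
        "meeting.ogv",
        b"\x00" * 1000,
        ["http://example.org/meeting.ogv", "http://example.com/meeting.ogv"],
        "Example comment",
    )
    with open("meeting.torrent", "wb") as fh:
        fh.write(torrent)
```

This reads the whole file into memory, which is fine for a sketch but would need chunked hashing for full-length meeting videos.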



New Features:

  • Add a Vorbis-only option on video streams for low-bandwidth connections.
  • Squash bugs (esp., video non-playback) in mwEmbed's libSequencer.
  • Identify a way to output time ranges from multi-speaker audio files, with each time range marking how long one person spoke (CMU Sphinx?). Ideally, add speaker identification based on the vocal profile, and maybe even facial recognition.
  • Figure out a content delivery network.
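
For the speaker time-range item above, the output step could look roughly like the sketch below. It assumes some upstream tool has already labelled short audio segments with a speaker; that tool is not shown, and nothing here is CMU Sphinx's API. The function name merge_speaker_turns, the max_gap parameter, and the (start, end, speaker) tuple layout are all hypothetical.

```python
def merge_speaker_turns(segments, max_gap=0.5):
    """Merge consecutive same-speaker segments into continuous time ranges.

    segments: iterable of (start_seconds, end_seconds, speaker_label).
    max_gap:  pauses up to this many seconds do not end a speaker's turn.
    """
    turns = []
    for start, end, speaker in sorted(segments):
        if turns and turns[-1][2] == speaker and start - turns[-1][1] <= max_gap:
            # Same speaker continues: extend the current range.
            prev_start, prev_end, _ = turns[-1]
            turns[-1] = (prev_start, max(prev_end, end), speaker)
        else:
            # A different speaker (or a long pause) starts a new range.
            turns.append((start, end, speaker))
    return turns


if __name__ == "__main__":
    # Hypothetical labelled segments, in seconds.
    labelled = [
        (0.0, 2.0, "A"),
        (2.1, 4.0, "A"),
        (4.0, 6.5, "B"),
        (6.5, 7.0, "A"),
    ]
    for start, end, speaker in merge_speaker_turns(labelled):
        print("%s: %.1f to %.1f" % (speaker, start, end))
```

Each resulting range could then be mapped to a clip's start and end timecodes.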


Communicate:

All videos and text are published under the CC-BY 3.0 U.S. or CC-BY-SA 3.0 copyright licenses.  Details.