# html-tokenize
**Repository Path**: mirrors_regular/html-tokenize
## Basic Information
- **Project Name**: html-tokenize
- **Description**: transform stream to tokenize html
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-09-25
- **Last Updated**: 2026-05-10
## README
# html-tokenize
transform stream to tokenize html
[build status](http://travis-ci.org/substack/html-tokenize)
# example
``` js
var fs = require('fs');
var tokenize = require('html-tokenize');
var through = require('through2');

fs.createReadStream(__dirname + '/table.html')
    .pipe(tokenize())
    .pipe(through.obj(function (row, enc, next) {
        // each row is [ name, buffer ]; stringify the buffer for printing
        row[1] = row[1].toString();
        console.log(row);
        next();
    }))
;
```
this html:
``` html
<table>
  <tbody>blah blah blah</tbody>
  <tr><td>there</td></tr>
  <tr><td>it</td></tr>
  <tr><td>is</td></tr>
</table>
```
generates this output:
```
[ 'open', '<table>' ]
[ 'text', '\n  ' ]
[ 'open', '<tbody>' ]
[ 'text', 'blah blah blah' ]
[ 'close', '</tbody>' ]
[ 'text', '\n  ' ]
[ 'open', '<tr>' ]
[ 'open', '<td>' ]
[ 'text', 'there' ]
[ 'close', '</td>' ]
[ 'close', '</tr>' ]
[ 'text', '\n  ' ]
[ 'open', '<tr>' ]
[ 'open', '<td>' ]
[ 'text', 'it' ]
[ 'close', '</td>' ]
[ 'close', '</tr>' ]
[ 'text', '\n  ' ]
[ 'open', '<tr>' ]
[ 'open', '<td>' ]
[ 'text', 'is' ]
[ 'close', '</td>' ]
[ 'close', '</tr>' ]
[ 'text', '\n' ]
[ 'close', '</table>' ]
[ 'text', '\n' ]
```
# methods
``` js
var tokenize = require('html-tokenize');
```
## var t = tokenize()
Return a transform stream `t` that takes html as input and emits rows of
output in object mode. Each output row has the form:
* `[ name, buffer ]`
The input stream maps completely onto the buffers from the object stream:
every byte of the input ends up in the buffer of some row, so nothing is dropped.
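
One consequence is that concatenating the buffers of every row, in order, should give back the original input. The sketch below illustrates this with hand-written rows in the documented `[ name, buffer ]` shape, standing in for what `tokenize()` would emit. `joinRows` is a hypothetical helper, not part of html-tokenize.

``` js
// Hypothetical helper (not part of html-tokenize): rebuild the input
// from a list of [ name, buffer ] rows by concatenating the buffers.
function joinRows(rows) {
    return Buffer.concat(rows.map(function (row) {
        return row[1];
    }));
}

// Hand-written rows in the documented shape; a real run would collect
// these from the tokenize() stream instead.
var rows = [
    [ 'open', Buffer.from('<p>') ],
    [ 'text', Buffer.from('hi') ],
    [ 'close', Buffer.from('</p>') ]
];

console.log(joinRows(rows).toString());
```
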
The possible values of `name` are:
* open
* close
* text
cdata, comments, and scripts all use `'open'` with their contents appearing in
subsequent `'text'` rows.
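
Since every row carries its type in `row[0]`, a consumer can branch on the name. As a minimal sketch, the hypothetical `visibleText` helper below keeps the contents of `'text'` rows and drops whitespace-only ones. It runs on hand-written rows and makes no attempt to skip script, comment, or cdata contents, which also arrive as `'text'` rows.

``` js
// Hypothetical helper (not part of html-tokenize): collect the trimmed
// contents of every non-blank 'text' row.
function visibleText(rows) {
    return rows
        .filter(function (row) { return row[0] === 'text'; })
        .map(function (row) { return row[1].toString().trim(); })
        .filter(function (str) { return str.length > 0; });
}

// Hand-written rows in the documented [ name, buffer ] shape.
var rows = [
    [ 'open', Buffer.from('<tr>') ],
    [ 'open', Buffer.from('<td>') ],
    [ 'text', Buffer.from('there') ],
    [ 'close', Buffer.from('</td>') ],
    [ 'close', Buffer.from('</tr>') ],
    [ 'text', Buffer.from('\n') ]
];

console.log(visibleText(rows));
```

In a stream pipeline the same checks could live inside a `through.obj` callback like the one in the example section.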
# usage
There is an html-tokenize command too.
```
usage: html-tokenize {FILE}
Tokenize FILE into newline-separated json arrays for each tag.
If FILE is not specified, use stdin.
```
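
The command's output can be read back line by line. This is a minimal sketch, assuming each non-empty line is one JSON array of the form `["name","contents"]` with the contents serialized as a string. That serialization is an assumption here, so check it against real output first. `parseTokenLines` is a hypothetical name.

``` js
// Hypothetical helper: parse newline-separated JSON arrays, as described
// in the usage text, into an array of [ name, contents ] rows.
// Assumes each non-empty line is one complete JSON array.
function parseTokenLines(output) {
    return output
        .split('\n')
        .filter(function (line) { return line.trim().length > 0; })
        .map(function (line) { return JSON.parse(line); });
}

// Sample text written by hand to match the assumed format.
var sample = '["open","<b>"]\n["text","hi"]\n["close","</b>"]\n';

console.log(parseTokenLines(sample));
```
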
# install
With [npm](https://npmjs.org), to get the library do:
```
npm install html-tokenize
```
or to get the command do:
```
npm install -g html-tokenize
```
# license
MIT