If you ever need to convert HTML+CSS (such as that created by WordPerfect or Word export) to generic TEI, you can write a JavaScript function that, for example, gathers up all the <span>s into an array and iterates through them one at a time, and then you do something like this to each of them. (What’s an array? For this purpose it’s a matrix with one row, i.e., an ordered list of things/entities. Looking up “array” itself on Wikipedia only proves the tired axiom: if you have to ask, you don’t know.) Anyway, something like this:
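// Turn <span style="font-variant: small-caps"> into <hi rend="small-caps">.
if (currentStyle === "font-variant: small-caps") {
  currentSpan.removeAttribute("style");
  var hi = xml.createElement("hi");
  // childNodes is a live list that shrinks as nodes are moved out of the span,
  // so copy it into a plain array before looping; otherwise nodes get skipped.
  var children = Array.prototype.slice.call(currentSpan.childNodes);
  for (var k = 0; k < children.length; k++) {
    hi.appendChild(children[k]);
  }
  hi.setAttribute("rend", "small-caps");
  currentSpan.parentNode.replaceChild(hi, currentSpan);
}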
Most of that should be human-readable, though since I haven’t included the entire function, the distinction between variables I’ve named and JavaScript syntax isn’t clear. Basically: if the current value of the variable named currentStyle is exactly “font-variant: small-caps”, then do the stuff within braces. That is, knock the attribute named “style” off the current span (currentSpan is a variable), create a variable named hi, and store in it a newly created element <hi>.
Then there’s a for loop: make a counter called k, start it at zero, and keep going as long as k is less than the number of the current span’s child nodes (i.e., if this <span> has text in it only, that’s one node; if it has text and another <span>, that’s two, probably). For each child node, append that child to the newly created <hi>, then add one to k. Stop when k has run past the last child node.
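To see the counter on its own, here is a minimal sketch that runs the same kind of for loop over a plain array instead of real DOM nodes. The array contents and the name visited are made up for illustration; only the k counter pattern comes from the snippet above.

```javascript
// A plain array standing in for the child nodes of one <span>:
// a text node, a nested <span>, and another text node (made-up contents).
var childNodes = ["Some text, ", "<span>nested</span>", " and more text"];

var visited = [];
for (var k = 0; k < childNodes.length; k++) {
  // k runs 0, 1, 2; childNodes[k] is the item at that position.
  visited.push(k + ": " + childNodes[k]);
}

// One line per item, each prefixed with its position in the array.
console.log(visited.join("\n"));
```

The loop stops by itself once k reaches childNodes.length, which is 3 here, so the body runs exactly three times.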
I have no idea how comp sci came up with this counter method, but something similar is used in C/C++, Java, and other major languages; consider it akin to a literary topos. Then, noting the closing brace that says we’re finished with counting k, set this <hi> to have the attribute @rend="small-caps". Finally, swap the <hi>, which now holds the span’s content, into the document in place of the current span.
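A caveat worth knowing when moving children with a counter: in the DOM, childNodes is a live list, and appendChild moves a node out of its old parent. If the loop reads currentSpan.childNodes directly while appending, the list shrinks under the counter and every other node is skipped. The sketch below imitates that with plain arrays rather than a real DOM; moveNode, moveAllLive and moveAllFromCopy are hypothetical names invented for this illustration.

```javascript
// Plain arrays stand in for DOM nodes here. moveNode() is a hypothetical
// helper that mimics appendChild: the item leaves its old list when moved.
function moveNode(from, index, to) {
  to.push(from.splice(index, 1)[0]);
}

// Counter loop over the list that is being emptied: skips every other item.
function moveAllLive(source) {
  var target = [];
  for (var k = 0; k < source.length; k++) {
    moveNode(source, k, target);
  }
  return target;
}

// Counter loop over a copy taken up front: every item gets moved.
function moveAllFromCopy(source) {
  var target = [];
  var copy = source.slice();
  for (var k = 0; k < copy.length; k++) {
    moveNode(source, source.indexOf(copy[k]), target);
  }
  return target;
}

var skipped = moveAllLive(["text ", "<span>", " more text"]);
var complete = moveAllFromCopy(["text ", "<span>", " more text"]);
console.log(skipped.length, complete.length); // prints 2 3
```

With a single child node the two versions behave the same, which is why the problem is easy to miss on simple test documents.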
We are actually within another for loop, which I didn’t paste, and it ensures that the whole wordy sequence is performed for each <span> in the target X(HT)ML document. And, full disclosure, the if statement there is an else if in the original because I’ve rolled several <span> conditions together into a function I’ve named chartidy, for tidying character-level formatting.
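For a rough idea of how the outer loop and the else if chain might fit together, here is a sketch under stated assumptions. It is not the original chartidy: the helper rendFor, the italic and bold branches, and the way currentStyle is read from the style attribute are all guesses made for illustration. Only the small-caps pairing and the names chartidy, currentSpan, currentStyle, hi and xml come from the post.

```javascript
// Hypothetical style-to-rend mapping. Only the small-caps pair comes from the
// post; the italic and bold branches are invented to show the else-if chain.
function rendFor(currentStyle) {
  if (currentStyle === "font-style: italic") {
    return "italic";
  } else if (currentStyle === "font-variant: small-caps") {
    return "small-caps";
  } else if (currentStyle === "font-weight: bold") {
    return "bold";
  }
  return null; // unrecognized style: leave the <span> alone
}

// Assumed shape of the outer loop; xml is expected to be a DOM Document.
function chartidy(xml) {
  // Copy the live collection first, because replacing spans changes it.
  var spans = Array.prototype.slice.call(xml.getElementsByTagName("span"));
  for (var i = 0; i < spans.length; i++) {
    var currentSpan = spans[i];
    var rend = rendFor(currentSpan.getAttribute("style"));
    if (rend === null) {
      continue;
    }
    var hi = xml.createElement("hi");
    // Copy the live child list too, then move each child into the new <hi>.
    var children = Array.prototype.slice.call(currentSpan.childNodes);
    for (var k = 0; k < children.length; k++) {
      hi.appendChild(children[k]);
    }
    hi.setAttribute("rend", rend);
    currentSpan.parentNode.replaceChild(hi, currentSpan);
  }
}
```

In a browser this would be called as chartidy(document), or with whatever parsed XML document you are tidying. Exact string matching means a style attribute written as "font-variant:small-caps;" would not be caught, so a real version would probably need to normalize the style value first.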