Currently, I am carrying out some HTML processing before storing the data in the database. If a user pastes content containing HTML tables, I need to eliminate certain tags and attributes.
To extract the table content, I am using content.match('<table[^>]*>(.*?)</table>'). The opening table tag specifies the width twice: once as a width attribute and once as a width property inside the style attribute. Example:
<table width="462" style="border-collapse: collapse; width: 348pt;">
My goal is to transform the opening tag into something like
<table style="border-collapse: collapse;">
However, I do not want to remove the width attribute or the width style from the tr and td tags inside the table. Can anyone recommend a suitable regex pattern for achieving this?
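To make the desired behavior concrete, here is a minimal sketch of the transformation I am after. It assumes JavaScript (my match call above would look the same in Ruby), and stripTableWidth is a hypothetical helper name, not part of any library. It rewrites only the opening table tag, so tr and td are never touched. It also assumes that no attribute value contains a ">" character and that no style value contains a ";" inside a quoted string or url(); for messier markup, an HTML parser would probably be safer than regex.

```javascript
// Hypothetical helper: strips width from <table> opening tags only.
function stripTableWidth(html) {
  // Match just the opening <table ...> tag; <tr> and <td> never match this pattern.
  return html.replace(/<table\b[^>]*>/gi, (tag) => {
    return tag
      // Remove a standalone width attribute (double-quoted, single-quoted, or unquoted value).
      // The leading \s keeps attributes such as data-width intact.
      .replace(/\swidth\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/gi, '')
      // Inside the style attribute, drop only the "width" declaration
      // (min-width, max-width, border-width, etc. are kept).
      .replace(/(\sstyle\s*=\s*)(["'])(.*?)\2/i, (match, prefix, quote, css) => {
        const kept = css
          .split(';')
          .map((decl) => decl.trim())
          .filter((decl) => decl !== '' && !/^width\s*:/i.test(decl));
        // If nothing is left, remove the style attribute entirely.
        return kept.length ? `${prefix}${quote}${kept.join('; ')};${quote}` : '';
      });
  });
}

// Usage with the example tag from above:
const input =
  '<table width="462" style="border-collapse: collapse; width: 348pt;">' +
  '<tr><td width="100" style="width: 75pt;">x</td></tr></table>';
console.log(stripTableWidth(input));
// <table style="border-collapse: collapse;"><tr><td width="100" style="width: 75pt;">x</td></tr></table>
```

The idea is to first isolate the opening table tag with one pattern and then clean it with two smaller replacements, rather than trying to do everything in a single regex.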