0:00:10so i totally don't even need this microphone 'cause i'm actually really
0:00:13loud what we're gonna talk about today is tagged pdfs and although we're
0:00:19gonna mention accessibility a little bit this actually isn't an accessibility presentation and the fact
0:00:26that the accessibility logo appears on every single slide is just me spacing that out
0:00:32and using the same template i always use but our agenda is to talk
0:00:37about tagged pdfs what they are why we want them and how to make them
0:00:43because once you find out what they are and why we want them
0:00:48finding out that it's really easy to make them is gonna be quite
0:00:53important in my opinion we'll talk about the current status of the project we're actually
0:00:57going to be talking about which is implementing tagged pdf support and then we're gonna
0:01:03demo demos are always subject to the demo effect and what we would be able
0:01:08to show you is totally screenshottable so we have screenshots we can provide demos
0:01:13upon request and we fixed the crash so we should be okay with that but we
0:01:18just don't see it as
0:01:21you know absolutely practically necessary
0:01:24okay so a tagged P D F that's kind of like a P D F on steroids
0:01:28it has a lot more stuff in it a lot of meta information in particular as
0:01:33you might have guessed from the name tagged pdf it has tags element tags very
0:01:37similar and in some cases exactly the same as you see in H T M
0:01:41L
0:01:42there are also I Ds for every single element
0:01:46there's alternative text if it's provided and there's also i had to kinda google to
0:01:51find out exactly what this means from the spec replacement text for symbols the spec
0:01:56is pretty
0:01:57flexible but the first thing or the one above it alternative text for images and
0:02:03other things like that those are like big phrases and descriptive
0:02:09whereas for symbols
0:02:12unicode characters a single letter that's a graphic those kind of things so the
0:02:17replacement text is assumed according to the spec to be a single character and that's
0:02:21what the difference is between those two things
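The description above (element tags, IDs, alternative text for images, replacement text for symbols) can be illustrated with a minimal Python sketch. This is not Poppler's API: the `StructElement` class, its field names, and the `spoken_text` helper are hypothetical names invented for illustration, and the priority order (replacement text first, then alternative text, then the element's own text and children) is an assumption about how a screen reader might choose what to announce.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StructElement:
    """Hypothetical stand-in for one structure element of a tagged PDF."""
    tag: str                                # e.g. "H1", "P", "Figure", "Span"
    element_id: Optional[str] = None        # per-element ID
    text: str = ""                          # text drawn on the page, if any
    alt_text: Optional[str] = None          # descriptive phrase, e.g. for an image
    replacement_text: Optional[str] = None  # short stand-in, e.g. for a symbol
    children: List["StructElement"] = field(default_factory=list)


def spoken_text(element: StructElement) -> str:
    """Choose what a screen reader could announce for an element (assumed order)."""
    if element.replacement_text is not None:
        return element.replacement_text
    if element.alt_text is not None:
        return element.alt_text
    parts = [element.text] if element.text else []
    parts.extend(spoken_text(child) for child in element.children)
    return " ".join(part for part in parts if part)


# A tiny made-up document: a heading, an image, and a symbol drawn as a graphic.
doc = StructElement("Document", children=[
    StructElement("H1", element_id="h-1", text="Tagged PDFs"),
    StructElement("Figure", element_id="fig-1", alt_text="Export dialog screenshot"),
    StructElement("Span", element_id="sym-1", text="\uf0e0", replacement_text="@"),
])
print(spoken_text(doc))
```

The point of the sketch is only the distinction made in the talk: alternative text is a descriptive phrase, while replacement text stands in for a single symbol.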
0:02:24okay why we want them and i told you i was gonna talk a little
0:02:27tiny bit about accessibility what's motivated our implementing the tagged P
0:02:34D F's
0:02:35is accessibility
0:02:37because there's a couple problems with evince as you guys may have noticed from
0:02:42our recent friends of gnome
0:02:43for into can campaign and i can't
0:02:48relate
0:02:51it's so know just been fired from the board
0:02:55for the friends of gnome campaign it's just a big old we need to
0:02:59make the stuff accessible and part of that is you know i believe zero accessibility right
0:03:04now part of it is for non tagged pdfs getting the minimal level of accessibility
0:03:09and what that will give users who are blind is access at about the level
0:03:14of gedit
0:03:16and as you know gedit doesn't have things like headings and list items
0:03:20and stuff like that it's just text so having just text is better than having
0:03:24nothing at all
0:03:25and we're not gonna talk about that aspect of what we've been working on but
0:03:29there is so much semantic and structural information in X M L mark
0:03:36marked up mark up marked up documents like that something is a heading level one
0:03:42or heading level two or a table and how many rows and columns are in
0:03:45that table these are all things that as sighted
0:03:48excuse my stuttering as sighted users
0:03:51we just look at them and you know we know what's going on if you're a
0:03:55user who is blind your screen reader needs to present that in speech and in
0:04:00braille and in order for that to happen we need access to all of that
0:04:04information and we don't have it because poppler has absolutely no support for tagged pdfs
0:04:13by the way thanks again friends of gnome this is lots of awesome
0:04:17stuff related to this whole project is happening because of that campaign
0:04:22now
0:04:24honestly i could care less about some of these things i am totally motivated by
0:04:28accessibility but i think the rest of the community should really care about tagged pdfs
0:04:32because of the items on this slide you know we keep talking about you
0:04:37know
0:04:38are we still calling it gnome os but we're talking about getting on
0:04:41tablets and smaller devices with smaller screens
0:04:44and if we can't reflow these documents on you know a little
0:04:49android phone or an iphone that looks like my android phone you
0:04:53know the whole pdf experience is gonna suck on the mobile platform so if we
0:04:58have tagged pdfs we could really easily do reflow
0:05:04then the rest is the copy paste and export if i select text from
0:05:10a beautifully marked up pdf it'd be really cool if i don't lose not just the
0:05:14formatting
0:05:16but you know the fact that something is a heading
0:05:18the fact that something is a table if i just get text
0:05:21i might have to reformat it or at that point i might just you know
0:05:24it might not be worth it i can type fast
0:05:28okay so there's good news and the bad news the good news is on the right if
0:05:33you use libreoffice
0:05:35and i'll show you a screenshot in a minute it is like super easy
0:05:38it's like one step to export your document to tagged pdf
0:05:43so that means if you care about accessibility if you want the reflow if
0:05:47you want to copy and paste it's a single step but the bad news is like
0:05:50nothing else supports this alex did a bunch of research the other day about
0:05:54that's it it's like and they work now but i did it's literally were just
0:05:58like no it's you know kinda like soup nazi on seinfeld no tagged P D
0:06:02F for you you know there is and you know better than i do some people
0:06:05are interested
0:06:07that in fact is possible indication of will be looks at the documents is fine
0:06:12because i've so you can just have to unpack pdf you can this problem to
0:06:19all the T and they usually we ought to us but pca that means that
0:06:25will docks cost all the is to all the information format
0:06:29i don't know we looks can supplant respond to that idea but in the case of
0:06:33latex there are a lot of people interested in that because after all for example in that
0:06:38kind of mix
0:06:41they want to i mean in a lot of faculties it is mandatory
0:06:45product to see have to complain it pdfs so they need that they are actually really
0:06:52interested because we want
0:06:55mathematics to be accessible
0:06:58something that is not happening right now in the case of it works is also
0:07:03happening the same when you save as P D F there is no
0:07:06provision to create a tagged pdf bought you can spoke to all the we develop this
0:07:14great that we did show them what i mean is that well just to do
0:07:17is there so they are not using that and then there a lot of tools
0:07:21that great E D S but they don't have in your children in the rest
0:07:24of we don't know them tutors if i would have leds thereof
0:07:29tools but then find then we will ask i would be an effective because they're
0:07:35this is what happened with other tools to
0:07:38it was kind of funny i said you know alex can you get this list of
0:07:42all the tools and it's kind of like okay all these are no may i
0:07:44stop now
0:07:46i was like you can stop so there may be other tools out there that do
0:07:49have this support but we found one that has perfectly good support and a
0:07:54whole bunch that don't and then alex got tired
0:07:59okay i know you can never get like an entire dialog in big enough
0:08:04but i'm hoping you can see the one little thing that's checked that's tagged pdf
0:08:08and so basically all you have to do with libreoffice and this is for
0:08:11writer calc and impress and presumably the others you just have to choose export
0:08:17to P D F from the file menu dialog pops up check tagged pdf what
0:08:22i wanna point out is that right above that checkbox is pdf slash a one
0:08:27a and all standards need to have lots and lots of letters and numbers on
0:08:30them i don't know why that is actually you know we kind of said that
0:08:34a tagged pdf is kinda like pdf on steroids well this P D F a one a
0:08:39it's kinda like tagged pdf on steroids
0:08:42and what it gives you or excuse me the objective is for searching and
0:08:47repurposing document content
0:08:50and it includes and i don't understand this in standards at all i would think
0:08:54that b would include a rather than a including b but for some reason the
0:08:58one a includes one b and that like i put on the slide is about
0:09:03document appearance
0:09:06and we're gonna come back to that point in a minute structure and hierarchy you
0:09:10know what are the child objects you know what are the parent objects you know
0:09:15it's actually a tree
0:09:17tagged pdf which we've already talked about unicode character maps to be perfectly frank i'd
0:09:23have to go read the big standard to know exactly what that means and what
0:09:28is expected and language specification just like which language something is
0:09:35so in terms of the current status of the work that's been done
0:09:39in poppler right now some of that's in master some of it's in a branch
0:09:42but right now the entire document structure of tagged pdfs
0:09:46has been implemented for poppler we're exposing all of this information for poppler glib
0:09:54and for those of you who are like poppler geeks there's all these different
0:09:58tools to examine documents and information about the authors and all that stuff and those
0:10:05tools have been modified to expose this information so that you can
0:10:10verify that a tagged pdf is coming out the way you expect in poppler
0:10:16okay we have not yet gotten into doing anything with this in evince that's gonna
0:10:22be the next main step and i put in parentheses you know then doing this
0:10:26for accessibility because again this is this is what motivated all of this work
0:10:31it's not what's awesome about all of this work that's what motivated it and we
0:10:35can't of course expose what is not in evince to assistive technologies but that's the
0:10:40next priority after about some
0:10:43in terms of all the support that we've done the reason i put a question
0:10:47mark by the one b part
0:10:50the one b part of the standard is because as you'll see in the fake
0:10:54demo screenshots
0:10:56we're already totally now able to expose or preserve rather all of the formatting
0:11:03so whether or not we've done every last aspect of the standard i couldn't tell
0:11:07you
0:11:08i again i would have to read that standard very carefully but at least functionally
0:11:12i believe that we have all that implemented
0:11:15tagged pdf again is already done we're exposing the hierarchy since i'm not entirely sure
0:11:21what the official definition and spec requirements are for unicode character maps i can't tell
0:11:26you
0:11:26and we
0:11:29and we are already exposing the language if there is a language associated with
0:11:35an element
0:11:37so i kind of already said this but our next steps are to do the
0:11:41evince side work
0:11:43expose that to assistive technologies
0:11:45and figure out exactly what the status is with the standard the last thing on
0:11:49there this is i think it'd be one of those patches welcome some people just
0:11:54i personally love libreoffice some people don't because it's this big giant at times
0:11:59slow creature there is no way we don't have time or the expertise or quite frankly the
0:12:06motivation to add export support in all of the tools that are currently no but
0:12:12if you saw your favourite tool on there i bet the community would welcome a
0:12:16patch
0:12:22and information with the previous like that about
0:12:27sport in the
0:12:30to use in the that previous this is something like the F on that you
0:12:35can problem
0:12:37when we were when we were talking a lot about sort of also well we
0:12:40use it is that they did he say that the well the results they were
0:12:44not
0:12:45your it is but i seen having target the previous for really target previous you
0:12:51because one of the tools web upgrading the and the same in this a way
0:12:56tools upgrade pdfs
0:12:58well not with the really what is allow great impacted bps because
0:13:03the P D of re a previous will not able to with a that's what
0:13:09so i want one is the other reason that colours mention is that it's like
0:13:14he said nobody knows what a tagged pdf is and that was one of the things that
0:13:18motivated me you know telling you what it is and why you should want
0:13:21to make tagged pdfs but i would kinda argue that even if none of you guys
0:13:26start making tagged pdfs there is actually still a lot of usage going on out there
0:13:30because of legal requirements you know if you are legally bound to have all of
0:13:36your documents accessible you already know about tagged pdfs you are already producing them
0:13:42so even if no one else is making tagged pdfs governments and educational institutions and other
0:13:48big people who could be sued for not providing documents that are accessible they're already
0:13:53doing that tagged pdf colours to know that
0:13:58okay and how going to the point of this is like that is getting because
0:14:02this funny because now is them moment that we talk about
0:14:07what is the world that
0:14:09we have done but the three that within that do that but what so we
0:14:14would like science i we imprisoned impetus
0:14:18just heated up to work because we are really hard not experience are doing document
0:14:23or something
0:14:24we are calling about him document but just every night
0:14:27so i thought that was most appropriate to go sweetie
0:14:32we thought that was a perfect the best for him and we also though the
0:14:39world was made by kind of you have at least a month in a piranha
0:14:42beams are you was a review want to call so
0:14:47i'd say and if someone is that some of the cold is right now i
0:14:52did but we were repository right now we so it's up to parents in
0:14:58you also in some but just some box and in fact some of the bass
0:15:02were and a
0:15:05quickly is this we we're where we were providing the demonstration because we were
0:15:11using tools in fact the fish buttons about providing the tools with the support of
0:15:18the or the target pdf and the other this at a little bar about a
0:15:23meeting is it huh
0:15:28right now
0:15:31but in the previous weeks kind of the of was W more the call so
0:15:34some of the buttons on that on that runs is already a master and is
0:15:39going is and only be remote
0:15:42and we have a that is life i mean we are we are the know
0:15:47that and miss sorting a how i passed
0:15:52how corpus is a task on light of quality strong what's easy but we want
0:15:57to we added up that as light because
0:16:01we wanted to and
0:16:04give is probably to that is not a trivial task
0:16:09and you have thought that is almost as much work import we have called on
0:16:13them but we really because this is also in a important
0:16:18as we are as the others
0:16:21the support them pop of course that means that eventually some other a three communities
0:16:27but so what communities like cute well
0:16:30date is work and we say so we have
0:16:34it is another example of whatever you some with all the official what communities
0:16:40mm what i've this is what i'm saying i mean it's not very important how
0:16:44much the lines about i mean the model sentences and i some popular three thirty
0:16:48thousand nice some public really means that is not really a at work bone at
0:16:53this is moment
0:16:55so
0:16:57S in the advertisements what we have it for what we have after so before
0:17:03we have a tool called pdfinfo that just prints some info about
0:17:09the P D F author et cetera and then we have some tools that create
0:17:14images based on the P D F
0:17:16or create just the text
0:17:19but those things are without structure and then we have another tool that creates
0:17:25html
0:17:26in fact this is
0:17:28well the reason we have that we mentioned that tool is because now event on
0:17:32the look at least some people ask i would like to with the P D
0:17:37F that i come to that with it being is not about what is what
0:17:40we have to
0:17:41and usually in those cases other people recommend to use pdftohtml
0:17:46to take the pdf on which is html but the problem with
0:17:51that is that you get a P D F without not something that you get
0:17:55that html without any structure
0:17:59and this is a sample
0:18:01i you at the i don't have to concede that during the pdf with here
0:18:06somewhere that the stuff and at the right it's the mail that you
0:18:11but before that you have work and you know it is the same at with
0:18:16i'm not and formatting at all i mean okay you have the ball but engaged
0:18:22me that you have you don't have the information about the kids for example if
0:18:26you if you use a table
0:18:29with that tool you who you will have only they can to do that they
0:18:34yes
0:18:36just print just without any formatting so you don't know if you aren't it is
0:18:40the only in this a controller on the second
0:18:44and the second column
0:18:48well this is
0:18:50exactly this is what i was saying i thought also a
0:18:55well i think that's probably for you would be for you can just a breeze
0:18:59but if you if you thing they do pdf
0:19:03when we pulled is this you can see at the right that you have here
0:19:08is but only text a lot right so is without any formatting at all
0:19:15so i can as a so we were saying we are going to support
0:19:18and sorry we as are the and countless other that report of target pdf from
0:19:23popular so they are so it'd to similar to use that support so i'll be
0:19:28and weighted up to call pdfs to know that
0:19:32if you don't do that there is one but you can things that this the
0:19:36stupor the just to the to be have can things
0:19:40so i've have some
0:19:43sorry some you have a chance to be pdf in for you know the two
0:19:46point just trooper finally
0:19:50to our problems did it you know emotive to display to here
0:19:57sure
0:19:58sure we have the same example
0:20:01okay left you have to pdf and the right to have this communal but now
0:20:05you
0:20:06it meant to system to maybe so
0:20:09well this is what i alluded to before that if you remember before
0:20:14all of the formatting was lost this gets back to that one b standard
0:20:18that i need to read very carefully to see if we're not implementing any aspect
0:20:23of it but this time the you know like the italics and heading two
0:20:28that carried over so like i said there's an excellent chance the bulk
0:20:32of the one b standard is now implemented as well
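As a rough illustration of why the structure information lets a converter keep headings and italics instead of emitting bare text, here is a small self-contained Python sketch. It is not how Poppler's HTML tool is implemented: the nested-tuple document model, the `TAG_TO_HTML` mapping, and the `to_html` function are all hypothetical, and the tag names used ("H2", "P", "Em") are illustrative stand-ins rather than a claim about the tags a real file contains.

```python
from html import escape

# Hypothetical mapping from structure tags to HTML element names.
TAG_TO_HTML = {"H1": "h1", "H2": "h2", "P": "p", "Em": "em"}


def to_html(node):
    """Render a (tag, children) node; children are strings or nested nodes."""
    tag, children = node
    inner = "".join(
        escape(child) if isinstance(child, str) else to_html(child)
        for child in children
    )
    name = TAG_TO_HTML.get(tag)
    # Unknown tags (such as the document root) contribute only their content.
    return f"<{name}>{inner}</{name}>" if name else inner


# Made-up example content: a level-two heading and a paragraph with italics.
doc = ("Document", [
    ("H2", ["Status"]),
    ("P", ["Formatting is ", ("Em", ["preserved"]), " now."]),
])
print(to_html(doc))
```

Without the tags, the same content would come out as one undifferentiated run of text, which is the situation shown in the earlier screenshots.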
0:20:42i don't is working
0:20:45and this is a screenshot of and the tool i mean and in that here
0:20:51is easy if there's to see that now we have
0:20:54a is true or of the document because we will have different blocks with this
0:20:59H one means he there so we can see that it meant based approach but
0:21:04probably it's you have to see with
0:21:08with the department did you know
0:21:11and we can see at
0:21:13there right about of the right the screenshot that we can see this is to
0:21:17prove using a at review
0:21:20that show that you we have
0:21:23a best and it meant that is heating and element that's a bar find them
0:21:27and that's another here
0:21:32we can see a next argument the sample with these items but i've got women
0:21:37thing
0:21:38the then the nist at this to put in fact it is what i'm saying
0:21:43with the previous business well we have the tooling was suggest getting the text a
0:21:48printing need what printing on is pushing it but without mundane interceptor
0:21:57and this is the case of the of i think that as you can see
0:22:02bits to me and maintains is that was a table we have to can do
0:22:05we have all the roles and with a put this is a tools we only
0:22:09half it text
0:22:14you but
0:22:16hello
0:22:17you can see the and if you for that is a terminal the mel to
0:22:23that you to maintaining the different
0:22:26for all okay there's from the table
0:22:31i think that
0:22:33as you can see we are not doing a club time you know because we
0:22:37have a really expose some these kind of presentations so we know that when you
0:22:42do that we have minutes work it just depends to quest so we had this
0:22:48question service as an see as i say you can
0:22:51in service we are going to probably space
0:22:55as limestone you contributing to tell you one so and you question
0:23:10you said one of the benefits of the tags was
0:23:14for reflow
0:23:16and maybe i'm just
0:23:18like revealing my ignorance here but i thought that one of the benefits of P D
0:23:22F was that it was always
0:23:25like print perfect basically however you view it so i was wondering like how the idea
0:23:29of reflow fits into like the typical pdf
0:23:33use case like is it like the fallback mode or is it
0:23:38i mean this is funny because when we were we have some for sits for
0:23:42that one people website the same the same on the forums was saying was saying
0:23:47that we did well not sort of all that maybe a puppy da was maintaining
0:23:52that there for you
0:23:54but
0:23:55busily that in the end
0:23:57you want to see the document don't know but there's no way of what is
0:24:02the size of your screen show in what i mean is
0:24:08for some local in this it's true that it doesn't make sense tutor for all
0:24:12but from others
0:24:14i mean it will be really strange if you need to force to share to
0:24:17make a sort of this illness after don't want to see that probably
0:24:24and thought there's some doubts about how to do that a how to probably that
0:24:30but maybe a is having some
0:24:34something similar to what it's demand has right now i mean in addition
0:24:40and estimate happens and you have the roles are hard you move the mobile phone
0:24:45is a about overflowing
0:24:47that this
0:24:48i mean i know that this
0:24:51it's just train is actually say because they are you know idea of the P
0:24:54D F was that but we also need to like take into account that these
0:24:59that pdfs support was obvious
0:25:03rescinded is less a presently you have if you have for the some yes
0:25:09but they needed in the same way that there are you know that a proposal
0:25:13to pdf was not was it was having something so more for really that that's
0:25:20great but size then it just added to about to something like something more well
0:25:26how to i don't know how to say that my documents friendly
0:25:34gonna say something and that might be totally not real that like kerry as if
0:25:40you say a bulleted list
0:25:44or a table and if the fact that that's an actual you
0:25:49know structural list or actual structural table is not known
0:25:54how is that text gonna wrap around is it going to manage to indent
0:25:58itself so it's not under the bullet on the second line or not and if
0:26:03it's just you know where i mean i think list items are gonna be the
0:26:08biggest use case but
0:26:10my going but you know tape i'm so you know what i'm really liked so
0:26:16that which is why i wear
0:26:18sunglasses
0:26:20i just was wondering
0:26:23no if you will
0:26:27i don't know we are written that actually question more
0:26:32i guess i can see
0:26:35we are getting
0:26:38one
0:26:39just to repeat myself i think the bullet instance is the biggest one because
0:26:44in
0:26:45we can actually put these on the screen but before we did tagged pdf support
0:26:50the bullet character is strictly a character just like any a and b and
0:26:54c or whatever and wrapping that text will work but if we don't treat it
0:27:00as an actual list item the indentation's gonna be screwed up and it's gonna look
0:27:03ugly
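The list-item point can be sketched with Python's standard `textwrap` module. This only illustrates the idea and says nothing about how any PDF viewer implements reflow: `wrap_plain` and `wrap_list_item` are hypothetical helper names, and the width of 20 columns is an arbitrary assumption standing in for a narrow screen.

```python
import textwrap


def wrap_plain(text, width):
    """Wrap text with no structure: the bullet is just another character."""
    return textwrap.fill(text, width=width)


def wrap_list_item(label, body, width):
    """Wrap a known list item: continuation lines align under the body text."""
    hanging = " " * (len(label) + 1)
    return textwrap.fill(
        body,
        width=width,
        initial_indent=label + " ",
        subsequent_indent=hanging,
    )


item = "reflow needs to know this is a list item"
print(wrap_plain("* " + item, 20))
print()
print(wrap_list_item("*", item, 20))
```

In the first result the wrapped lines start directly under the bullet; in the second they are indented past it, which is only possible when the content is known to be a list item with a separate label and body.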
0:27:08i have another question to actually
0:27:12so you showed a list of
0:27:14of tools that
0:27:16can output tagged pdfs
0:27:19earlier in your presentation
0:27:24i was wondering if you also looked at
0:27:27other
0:27:28consumers of P D F that support tagged pdfs and what's the support like
0:27:33for
0:27:35so like the rear probably supports I P F but also like be this pdf
0:27:39that yes support type you have placed in that also sort of drive adoption
0:27:44something
0:27:48can you repeat the question because i think that we didn't get it
0:27:51so those tools are tools that make pdfs and support tagged P D F
0:27:57what tools that read P D F support tagged P D F okay
0:28:01we did we
0:28:03but in the little world and if just what was in the pitch so we
0:28:08will
0:28:09it means will be the first one that will support that because okay we did
0:28:13it is that because this is official where eve and so you know which will
0:28:18talk about our spares but and in the rest of the world show windows are
0:28:27that the stuff
0:28:28and i've got we will unlock what about the right there is
0:28:33the they wouldn't course
0:28:36for example because of well support a spell target pdf
0:28:44and at this address something that provided above is that is that is that how
0:28:51it but also that
0:28:53that we sorry so about be provided a planning or something like that to make
0:28:59these we have to pdf
0:29:02and now
0:29:04because so we have these already doing that so
0:29:08i mean thought in
0:29:10a lot of
0:29:11and government pages talking about and how the pdf should be so should be accessible
0:29:18base a use accurate right that will be sort of that these target and
0:29:25make sure that using that growing up right that we is a screen we are
0:29:30that it it's work fine
0:29:33and we there was of the pose well it is happening all is the same
0:29:38the saying that means that they don't support like pdf in but for number one
0:29:43of the samplers is this pdfs to do it it's to use of the word
0:29:47for windows will about us is aware set of these is having the same thing
0:29:52this is but it's also say is a spot to pdf but is not target
0:29:57so
0:29:59well we well this work with a tinge finished
0:30:04we mean that up to fish or what communicate we will have one tool that
0:30:08properly as well too bad pdf that is little things i want to that is
0:30:12the teams that probably we target pdfs that would be more or less the same
0:30:17situation that in the winter that is probably right about it
0:30:22we
0:30:25the difference in terms of windows is that windows has had the ability with
0:30:29its screen readers to provide access to tagged pdf for i've lost track but over
0:30:35a decade now and you know i know this isn't an accessibility talk again
0:30:39but i can't separate myself from accessibility it's really embarrassing when people on the
0:30:45orca list say you know i can use orca with firefox and libreoffice
0:30:49and do all this stuff and people say well what do you do to
0:30:52read pdfs and some people say you know do the P D F to H
0:30:55T M L but you lose all the heading information but what a lot of
0:30:58people say is i boot into windows and i use jaws
0:31:02so we were solving well we're actually by not having tagged pdf support we're actually
0:31:07sending people off to use non free software and now we're solving this problem
0:31:14okay and a question sorry we are we are similar question
0:31:19okay thank
0:31:35what do you think the next most important thing is for accessibility
0:31:41in get a but not really with you have
0:31:50exactly and the next is sort of the
0:31:54if i have a real that was will verify from on this where it was
0:31:57not able to go to a presentation about well to the president about was able
0:32:03to or worse will because the next thing for the next challenge for the so
0:32:11this team will be the will answer about right now it is pi two
0:32:18well it's with X in five first of them this is with X so that
0:32:22made this bass will be
0:32:25making sure that all other stuff works with wayland
0:32:29in fact agree you have to dean is somewhat behind the other team's because as
0:32:37far as the normal but down from can still people use
0:32:41grading us some kind of this buddy mental bronze we the way that's about so
0:32:47that means that without if you use the plants you can there is
0:32:52probably is cost problem or something like that i don't know well the that if
0:32:56you something to but i a list they
0:32:59they will provide to the users
0:33:02i way to this can so long way to and we don't have that on
0:33:05it's pi so this is the next thing
0:33:11is okay thank
0:33:15and more question
0:33:26i so i just i mean i know idea not has been video so does
0:33:32is this related to something like pdf annotations is it
0:33:37could it be used for something similar or is it a bit of an unrelated thing
0:33:43so
0:33:44i think that the different stuff and i think you know five anything that you
0:33:48bins racial but adaptation
0:33:52so
0:33:54well we support it a little bit but there was a google summer of code
0:33:59project that sounds like it might not continue to do some work on that and
0:34:03there's been a lot of discussion around making i don't know the specifics but around
0:34:09making annotations way more cool so it sounds like we have a little
0:34:14support for it but we need a whole lot more but in terms of since
0:34:17i've already turned this talk into an accessibility talk now annotations to my knowledge have
0:34:23nothing to do with tagged pdfs in terms of accessibility after the annotations get implemented
0:34:29we're gonna probably have to do a similar accessibility implementation
0:34:33okay thank you
0:34:41so i think that
0:34:44no well thank you for being if you have been every question
0:34:48you are